Re: [PATCH] x86, sched: Fix the AMD CPPC maximum perf on some specific generations
On Tue, Apr 20, 2021 at 04:22:31PM +0800, Borislav Petkov wrote: > On Tue, Apr 20, 2021 at 04:09:43PM +0800, Huang Rui wrote: > > Some AMD Ryzen generations has different calculation method on maximum > > perf. 255 is not for all asics, some specific generations used 166 as > > the maximum perf. This patch is to fix the different maximum perf value > > Avoid having "This patch" or "This commit" in the commit message. It is > tautologically useless. > > Also, do > > $ git grep 'This patch' Documentation/process > > for more details. Thanks and good to know, I will enhance the commit message in V2. > > > of AMD CPPC. > > > > Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD > > systems") > > Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost > > frequencies") > > > > Reported-by: Jason Bagavatsingham > > Tested-by: Jason Bagavatsingham > > Signed-off-by: Huang Rui > > Cc: Alex Deucher > > Cc: Nathan Fontenot > > Cc: Rafael J. Wysocki > > Cc: Borislav Petkov > > Cc: x...@kernel.org > > --- > > arch/x86/kernel/smpboot.c | 33 ++- > > drivers/cpufreq/acpi-cpufreq.c | 41 ++ > > 2 files changed, 73 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > > index 02813a7f3a7c..705bc5ceb1ea 100644 > > --- a/arch/x86/kernel/smpboot.c > > +++ b/arch/x86/kernel/smpboot.c > > @@ -2033,6 +2033,37 @@ static bool intel_set_max_freq_ratio(void) > > } > > > > #ifdef CONFIG_ACPI_CPPC_LIB > > +static u64 amd_get_highest_perf(void) > > +{ > > struct cpuinfo_x86 *c = _cpu_data; > > and then you can use "c" everywhere. > > > + u64 cppc_max_perf; > > u64 for something which fits in a byte? > > Also, > max_perf = 255; > > and you can remove the else and default branches below. I aligned with highest_perf type in get_max_boost_ratio() funciton. Will clean the "else" and "default" branches in V2. > > > + > > + switch (boot_cpu_data.x86) { > > + case 0x17: > > + if ((boot_cpu_data.x86_model >= 0x30 && > > +boot_cpu_data.x86_model < 0x40) || > > + (boot_cpu_data.x86_model >= 0x70 && > > +boot_cpu_data.x86_model < 0x80)) > > + cppc_max_perf = 166; > > + else > > + cppc_max_perf = 255; > > + break; > > + case 0x19: > > + if ((boot_cpu_data.x86_model >= 0x20 && > > +boot_cpu_data.x86_model < 0x30) || > > + (boot_cpu_data.x86_model >= 0x40 && > > +boot_cpu_data.x86_model < 0x70)) > > + cppc_max_perf = 166; > > + else > > + cppc_max_perf = 255; > > + break; > > + default: > > + cppc_max_perf = 255; > > + break; > > + } > > + > > + return cppc_max_perf; > > +} > > Why is this here and not in arch/x86/kernel/cpu/amd.c? OK, I will modify to abstract this function in amd.c and then call it both on smpboot and acpi-cpufreq. > > > + > > static bool amd_set_max_freq_ratio(void) > > { > > struct cppc_perf_caps perf_caps; > > > > > @@ -2046,8 +2077,8 @@ static bool amd_set_max_freq_ratio(void) > > return false; > > } > > > > - highest_perf = perf_caps.highest_perf; > > nominal_perf = perf_caps.nominal_perf; > > + highest_perf = amd_get_highest_perf(); > > > > if (!highest_perf || !nominal_perf) { > > pr_debug("Could not retrieve highest or nominal performance\n"); > > diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c > > index d1bbc16fba4b..e5c03360db20 100644 > > --- a/drivers/cpufreq/acpi-cpufreq.c > > +++ b/drivers/cpufreq/acpi-cpufreq.c > > @@ -630,6 +630,44 @@ static int acpi_cpufreq_blacklist(struct cpuinfo_x86 > > *c) > > #endif > > > > #ifdef CONFIG_ACPI_CPPC_LIB > > + > > +static u64 get_amd_max_boost_ratio(unsigned int cpu, u64 nominal_perf) > > +{ > > + u64 boost_ratio, cppc_max_perf; > > + > > + if (!nominal_perf) > > + return 0; > > + > > + switch (boot_cpu_data.x86) { > > + case
[PATCH] x86, sched: Fix the AMD CPPC maximum perf on some specific generations
Some AMD Ryzen generations has different calculation method on maximum perf. 255 is not for all asics, some specific generations used 166 as the maximum perf. This patch is to fix the different maximum perf value of AMD CPPC. Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies") Reported-by: Jason Bagavatsingham Tested-by: Jason Bagavatsingham Signed-off-by: Huang Rui Cc: Alex Deucher Cc: Nathan Fontenot Cc: Rafael J. Wysocki Cc: Borislav Petkov Cc: x...@kernel.org --- arch/x86/kernel/smpboot.c | 33 ++- drivers/cpufreq/acpi-cpufreq.c | 41 ++ 2 files changed, 73 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 02813a7f3a7c..705bc5ceb1ea 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -2033,6 +2033,37 @@ static bool intel_set_max_freq_ratio(void) } #ifdef CONFIG_ACPI_CPPC_LIB +static u64 amd_get_highest_perf(void) +{ + u64 cppc_max_perf; + + switch (boot_cpu_data.x86) { + case 0x17: + if ((boot_cpu_data.x86_model >= 0x30 && +boot_cpu_data.x86_model < 0x40) || + (boot_cpu_data.x86_model >= 0x70 && +boot_cpu_data.x86_model < 0x80)) + cppc_max_perf = 166; + else + cppc_max_perf = 255; + break; + case 0x19: + if ((boot_cpu_data.x86_model >= 0x20 && +boot_cpu_data.x86_model < 0x30) || + (boot_cpu_data.x86_model >= 0x40 && +boot_cpu_data.x86_model < 0x70)) + cppc_max_perf = 166; + else + cppc_max_perf = 255; + break; + default: + cppc_max_perf = 255; + break; + } + + return cppc_max_perf; +} + static bool amd_set_max_freq_ratio(void) { struct cppc_perf_caps perf_caps; @@ -2046,8 +2077,8 @@ static bool amd_set_max_freq_ratio(void) return false; } - highest_perf = perf_caps.highest_perf; nominal_perf = perf_caps.nominal_perf; + highest_perf = amd_get_highest_perf(); if (!highest_perf || !nominal_perf) { pr_debug("Could not retrieve highest or nominal performance\n"); diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index d1bbc16fba4b..e5c03360db20 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -630,6 +630,44 @@ static int acpi_cpufreq_blacklist(struct cpuinfo_x86 *c) #endif #ifdef CONFIG_ACPI_CPPC_LIB + +static u64 get_amd_max_boost_ratio(unsigned int cpu, u64 nominal_perf) +{ + u64 boost_ratio, cppc_max_perf; + + if (!nominal_perf) + return 0; + + switch (boot_cpu_data.x86) { + case 0x17: + if ((boot_cpu_data.x86_model >= 0x30 && +boot_cpu_data.x86_model < 0x40) || + (boot_cpu_data.x86_model >= 0x70 && +boot_cpu_data.x86_model < 0x80)) + cppc_max_perf = 166; + else + cppc_max_perf = 255; + break; + case 0x19: + if ((boot_cpu_data.x86_model >= 0x20 && +boot_cpu_data.x86_model < 0x30) || + (boot_cpu_data.x86_model >= 0x40 && +boot_cpu_data.x86_model < 0x70)) + cppc_max_perf = 166; + else + cppc_max_perf = 255; + break; + default: + cppc_max_perf = 255; + break; + } + + boost_ratio = div_u64(cppc_max_perf << SCHED_CAPACITY_SHIFT, + nominal_perf); + + return boost_ratio; +} + static u64 get_max_boost_ratio(unsigned int cpu) { struct cppc_perf_caps perf_caps; @@ -646,6 +684,9 @@ static u64 get_max_boost_ratio(unsigned int cpu) return 0; } + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) + return get_amd_max_boost_ratio(cpu, perf_caps.nominal_perf); + highest_perf = perf_caps.highest_perf; nominal_perf = perf_caps.nominal_perf; -- 2.25.1
Re: [PATCH 2/2] drm/ttm: optimize the pool shrinker a bit v2
On Thu, Apr 15, 2021 at 07:56:24PM +0800, Christian König wrote: > Switch back to using a spinlock again by moving the IOMMU unmap outside > of the locked region. > > v2: Add a comment explaining why we need sync_shrinkers(). > > Signed-off-by: Christian König Series look good for me as well. Acked-by: Huang Rui > --- > drivers/gpu/drm/ttm/ttm_pool.c | 44 +- > 1 file changed, 22 insertions(+), 22 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c > index cb38b1a17b09..955836d569cc 100644 > --- a/drivers/gpu/drm/ttm/ttm_pool.c > +++ b/drivers/gpu/drm/ttm/ttm_pool.c > @@ -70,7 +70,7 @@ static struct ttm_pool_type global_uncached[MAX_ORDER]; > static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER]; > static struct ttm_pool_type global_dma32_uncached[MAX_ORDER]; > > -static struct mutex shrinker_lock; > +static spinlock_t shrinker_lock; > static struct list_head shrinker_list; > static struct shrinker mm_shrinker; > > @@ -263,9 +263,9 @@ static void ttm_pool_type_init(struct ttm_pool_type *pt, > struct ttm_pool *pool, > spin_lock_init(>lock); > INIT_LIST_HEAD(>pages); > > - mutex_lock(_lock); > + spin_lock(_lock); > list_add_tail(>shrinker_list, _list); > - mutex_unlock(_lock); > + spin_unlock(_lock); > } > > /* Remove a pool_type from the global shrinker list and free all pages */ > @@ -273,9 +273,9 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt) > { > struct page *p; > > - mutex_lock(_lock); > + spin_lock(_lock); > list_del(>shrinker_list); > - mutex_unlock(_lock); > + spin_unlock(_lock); > > while ((p = ttm_pool_type_take(pt))) > ttm_pool_free_page(pt->pool, pt->caching, pt->order, p); > @@ -313,24 +313,19 @@ static struct ttm_pool_type > *ttm_pool_select_type(struct ttm_pool *pool, > static unsigned int ttm_pool_shrink(void) > { > struct ttm_pool_type *pt; > - unsigned int num_freed; > struct page *p; > > - mutex_lock(_lock); > + spin_lock(_lock); > pt = list_first_entry(_list, typeof(*pt), shrinker_list); > + list_move_tail(>shrinker_list, _list); > + spin_unlock(_lock); > > p = ttm_pool_type_take(pt); > - if (p) { > - ttm_pool_free_page(pt->pool, pt->caching, pt->order, p); > - num_freed = 1 << pt->order; > - } else { > - num_freed = 0; > - } > - > - list_move_tail(>shrinker_list, _list); > - mutex_unlock(_lock); > + if (!p) > + return 0; > > - return num_freed; > + ttm_pool_free_page(pt->pool, pt->caching, pt->order, p); > + return 1 << pt->order; > } > > /* Return the allocation order based for a page */ > @@ -530,6 +525,11 @@ void ttm_pool_fini(struct ttm_pool *pool) > for (j = 0; j < MAX_ORDER; ++j) > ttm_pool_type_fini(>caching[i].orders[j]); > } > + > + /* We removed the pool types from the LRU, but we need to also make sure > + * that no shrinker is concurrently freeing pages from the pool. > + */ > + sync_shrinkers(); > } > > /* As long as pages are available make sure to release at least one */ > @@ -604,7 +604,7 @@ static int ttm_pool_debugfs_globals_show(struct seq_file > *m, void *data) > { > ttm_pool_debugfs_header(m); > > - mutex_lock(_lock); > + spin_lock(_lock); > seq_puts(m, "wc\t:"); > ttm_pool_debugfs_orders(global_write_combined, m); > seq_puts(m, "uc\t:"); > @@ -613,7 +613,7 @@ static int ttm_pool_debugfs_globals_show(struct seq_file > *m, void *data) > ttm_pool_debugfs_orders(global_dma32_write_combined, m); > seq_puts(m, "uc 32\t:"); > ttm_pool_debugfs_orders(global_dma32_uncached, m); > - mutex_unlock(_lock); > + spin_unlock(_lock); > > ttm_pool_debugfs_footer(m); > > @@ -640,7 +640,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct > seq_file *m) > > ttm_pool_debugfs_header(m); > > - mutex_lock(_lock); > + spin_lock(_lock); > for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) { > seq_puts(m, "DMA "); > switch (i) { > @@ -656,7 +656,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct > seq_file *m) > } > ttm_pool_debugfs_orders(pool->caching[i].orders, m); > } > - mutex_unlock(_lock); > + spin_unlock(_lock); > > ttm_pool_debugfs_footer(m); > return 0; > @@ -693,7 +693,7 @@ int ttm_pool_mgr_init(unsigned long num_pages) > if (!page_pool_size) > page_pool_size = num_pages; > > - mutex_init(_lock); > + spin_lock_init(_lock); > INIT_LIST_HEAD(_list); > > for (i = 0; i < MAX_ORDER; ++i) { > -- > 2.25.1 >
Re: your mail
On Wed, Apr 07, 2021 at 09:27:46AM +0800, songqiang wrote: Please add the description in the commit message and subject. Thanks, Ray > Signed-off-by: songqiang > --- > drivers/gpu/drm/ttm/ttm_page_alloc.c | 18 ++ > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c > b/drivers/gpu/drm/ttm/ttm_page_alloc.c > index 14660f723f71..f3698f0ad4d7 100644 > --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c > +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c > @@ -736,8 +736,16 @@ static void ttm_put_pages(struct page **pages, unsigned > npages, int flags, > if (++p != pages[i + j]) > break; > > - if (j == HPAGE_PMD_NR) > + if (j == HPAGE_PMD_NR) { > order = HPAGE_PMD_ORDER; > + for (j = 1; j < HPAGE_PMD_NR; ++j) > + page_ref_dec(pages[i+j]); > + } > } > #endif > > @@ -868,10 +876,12 @@ static int ttm_get_pages(struct page **pages, unsigned > npages, int flags, > p = alloc_pages(huge_flags, HPAGE_PMD_ORDER); > if (!p) > break; > - > - for (j = 0; j < HPAGE_PMD_NR; ++j) > + for (j = 0; j < HPAGE_PMD_NR; ++j) { > pages[i++] = p++; > - > + if (j > 0) > + page_ref_inc(pages[i-1]); > + } > npages -= HPAGE_PMD_NR; > } > } > > > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=04%7C01%7Cray.huang%40amd.com%7C4ccc617b77d746db5af108d8f98db612%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637533734805563118%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=9bSP90LYdJyJYJYmuphVmqk%2B3%2FE4JPrtXkQTbxwAt68%3Dreserved=0
Re: [PATCH 0/3] iommu/amd: Fix booting with amd_iommu=off
On Wed, Mar 17, 2021 at 05:10:34PM +0800, Joerg Roedel wrote: > From: Joerg Roedel > > Hi, > > it turned out that booting a kernel with amd_iommu=off on a machine > that has an AMD IOMMU causes an early kernel crash. There are two > reasons for this, and fixing one of them is already sufficient, but > both reasons deserve fixing, which is done in this patch-set. > > Regards, > > Joerg > > Joerg Roedel (3): > iommu/amd: Move Stoney Ridge check to detect_ivrs() > iommu/amd: Don't call early_amd_iommu_init() when AMD IOMMU is > disabled > iommu/amd: Keep track of amd_iommu_irq_remap state Series are Acked-by: Huang Rui > > drivers/iommu/amd/init.c | 36 > 1 file changed, 20 insertions(+), 16 deletions(-) > > -- > 2.30.2 >
Re: [PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable
On Tue, Mar 16, 2021 at 11:00:26PM +0800, Joerg Roedel wrote: > On Tue, Mar 16, 2021 at 09:36:02PM +0800, Huang Rui wrote: > > Thanks for the comments. Could you please elaborate this? > > > > Do you mean while amd_iommu=off, we won't prepare the IVRS, and even > > needn't get all ACPI talbes. Because they are never be used and the next > > state will always goes into IOMMU_CMDLINE_DISABLED, am I right? > > The first problem was that amd_iommu_irq_remap is never set back to > false when irq-remapping initialization fails in amd_iommu_prepare(). > > But there are other problems, like that even when the IOMMU is set to > disabled on the command line with amd_iommu=off, the code still sets up > all IOMMUs and registers IRQ domains for them. > Yes, that was the problem I found in my side. No very clear about the orignal intention. But if you said this is a problem, that makes sense... > Later the code checks wheter the IOMMU should stay disabled and tears > everything down, except for the IRQ domains, which stay in the global > list. > > The APIs do not really support tearing down IRQ domains well, so its not > so easy to add this to the tear-down path. Now that the IRQ domains stay > in the list, the ACPI code will come along later and calls the > ->select() call-back for every IRQ domain, which gets execution to > irq_remapping_select(), depite IOMMU being disabled and > amd_iommu_rlookup_table already de-allocated. But since > amd_iommu_irq_remap is still true the NULL pointer is dereferenced, > causing the crash. OK. We found some memleaks here before as well. It looks cause by iommu data buffer initialization and not be cleared. And yes, if setup iommu here, it actually causes many issues. unreferenced object 0x888332047500 (size 224): comm "swapper/0", pid 0, jiffies 4294892296 (age 362.192s) hex dump (first 32 bytes): b0 22 03 00 00 00 00 00 00 00 00 40 00 00 00 00 .".@ 06 00 00 00 00 00 00 00 00 20 00 00 00 20 00 00 . ... .. backtrace: [<8f162024>] kmem_cache_alloc+0x145/0x440 [<420093ba>] kmem_cache_create_usercopy+0x127/0x2c0 [<f5ed7ff0>] kmem_cache_create+0x16/0x20 [<4c1e4f47>] iommu_go_to_state+0x8e4/0x165d [<a191b705>] amd_iommu_prepare+0x1a/0x2f [<afe5b97e>] irq_remapping_prepare+0x35/0x6d [<209f36b5>] enable_IR_x2apic+0x2b/0x1ae [<b64491b5>] default_setup_apic_routing+0x16/0x65 [<e89c64a1>] apic_intr_mode_init+0x81/0x10a [<eaa14bf6>] x86_late_time_init+0x24/0x35 [<e291fc71>] start_kernel+0x509/0x5c7 [<99930dfe>] x86_64_start_reservations+0x24/0x26 [<b369765d>] x86_64_start_kernel+0x75/0x79 [<4f411919>] secondary_startup_64_no_verify+0xb0/0xbb > > When the IRQ domains would not be around, this would also not happen. So > my patches also change the initializtion to not do all the setup work > when amd_iommu=off was passed on the command line. > Thanks for the fix and explaination. :-) Best Regards, Ray
Re: [PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable
On Tue, Mar 16, 2021 at 09:16:34PM +0800, Joerg Roedel wrote: > Hi Huang, > > On Thu, Mar 11, 2021 at 10:28:07PM +0800, Huang Rui wrote: > > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c > > index f0adbc48fd17..a08e885403b7 100644 > > --- a/drivers/iommu/amd/iommu.c > > +++ b/drivers/iommu/amd/iommu.c > > @@ -3862,7 +3862,7 @@ static int irq_remapping_select(struct irq_domain *d, > > struct irq_fwspec *fwspec, > > else if (x86_fwspec_is_hpet(fwspec)) > > devid = get_hpet_devid(fwspec->param[0]); > > > > - if (devid < 0) > > + if (devid < 0 || !amd_iommu_rlookup_table) > > return 0; > > The problem is deeper than this fix suggests. I prepared other fixes for > this particular problem. Please find them here: > Thanks for the comments. Could you please elaborate this? Do you mean while amd_iommu=off, we won't prepare the IVRS, and even needn't get all ACPI talbes. Because they are never be used and the next state will always goes into IOMMU_CMDLINE_DISABLED, am I right? Thanks, Ray > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fjoro%2Flinux.git%2Flog%2F%3Fh%3Diommu-fixesdata=04%7C01%7Cray.huang%40amd.com%7Cce731dda3b444ac9a14108d8e87dbb3e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637514974013915073%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=NVahf1tIfgno%2BNWPu8hY4DygiGWdKXBJ8G6OsD%2BHC14%3Dreserved=0 > > Regards, > > Joerg
Re: [PATCH V2] drm: amd: pm: Mundane typo fixes in the file amdgpu_pm.c
On Mon, Mar 15, 2021 at 11:21:36AM +0800, Bhaskar Chowdhury wrote: > > s/"an minimum"/"a minimum"/ > s/"an maxmum"/"a maximum"/ > > Signed-off-by: Bhaskar Chowdhury Reviewed-by: Huang Rui > --- > Changes from V1: > Randy's suggestion to adjust the subject line text > And missed out a spell too,which now included > > drivers/gpu/drm/amd/pm/amdgpu_pm.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c > index 5fa65f191a37..308249ae1a22 100644 > --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c > @@ -3315,9 +3315,9 @@ static ssize_t amdgpu_hwmon_show_mclk_label(struct > device *dev, > * > * - pwm1_max: pulse width modulation fan control maximum level (255) > * > - * - fan1_min: an minimum value Unit: revolution/min (RPM) > + * - fan1_min: a minimum value Unit: revolution/min (RPM) > * > - * - fan1_max: an maxmum value Unit: revolution/max (RPM) > + * - fan1_max: a maximum value Unit: revolution/max (RPM) > * > * - fan1_input: fan speed in RPM > * > -- > 2.30.2 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=04%7C01%7Cray.huang%40amd.com%7C7815a224727f4c9b556008d8e76182b4%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637513753293707291%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=ApeKXDkijsihPcGSoI9v8ypRcsUXSb2Y7%2FpKsm3c4Xo%3Dreserved=0
[PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable
While amd_iommu is set to disable in cmd line, it will free the iommu resources. Then the pages of rlookup table is freed as well. If that, we have to check rlookup table in irq_remapping_select(), otherwise, it will trigger a kernel panic below. [2.245855] BUG: kernel NULL pointer dereference, address: 0500 [2.252861] #PF: supervisor read access in kernel mode [2.258053] #PF: error_code(0x) - not-present page [2.263247] PGD 0 P4D 0 [2.265844] Oops: [#1] SMP NOPTI [2.269570] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-custom #1 [2.276150] Hardware name: AMD Chachani-VN/Chachani-VN, BIOS VCH162755N.FD 03/04/2021 [2.284085] RIP: 0010:irq_remapping_select+0x5c/0xb0 [2.289107] Code: 4b 0c 48 3d 70 7c c8 8f 75 0d eb 35 48 8b 00 48 3d 70 7c c8 8f 74 2a 0f b6 50 10 39 d1 75 ed 0f b7 40 12 48 8b 15 f4 8a 3b 02 <48> 8b 14 c2 48 85 d2 74 0e b8 01 00 00 00 4c 3b a2 90 04 00 00 74 [2.307999] RSP: :8f403d40 EFLAGS: 00010246 [2.313285] RAX: 00a0 RBX: 8f403d98 RCX: 0021 [2.320471] RDX: RSI: 000a RDI: 8cbb4006118a [2.327658] RBP: 8f403d50 R08: 0021 R09: 0002 [2.334838] R10: 000a R11: f000 R12: 8cbb401bfe40 [2.342022] R13: R14: 8cbb401be900 R15: [2.349210] FS: () GS:8cbd73e0() knlGS: [2.357399] CS: 0010 DS: ES: CR0: 80050033 [2.363198] CR2: 0500 CR3: 0002dae0a000 CR4: 000406b0 [2.370382] Call Trace: [2.372897] irq_find_matching_fwspec+0x48/0xd0 [2.377489] mp_irqdomain_create+0x7c/0x180 [2.381736] ? __raw_callee_save___native_queued_spin_unlock+0x15/0x23 [2.388320] setup_IO_APIC+0x81/0x875 [2.392048] ? clear_IO_APIC_pin+0xd6/0x130 [2.396294] ? clear_IO_APIC+0x39/0x60 [2.400103] apic_intr_mode_init+0x107/0x10a [2.404432] x86_late_time_init+0x24/0x35 [2.408507] start_kernel+0x509/0x5c7 [2.412230] x86_64_start_reservations+0x24/0x26 [2.416911] x86_64_start_kernel+0x75/0x79 [2.421068] secondary_startup_64_no_verify+0xb0/0xbb Fixes: a1a785b572425 ("iommu/amd: Implement select() method on remapping irqdomain") Tested-by: Xiaojian Du Signed-off-by: Huang Rui Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: Alex Deucher Cc: sta...@vger.kernel.org --- drivers/iommu/amd/iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index f0adbc48fd17..a08e885403b7 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3862,7 +3862,7 @@ static int irq_remapping_select(struct irq_domain *d, struct irq_fwspec *fwspec, else if (x86_fwspec_is_hpet(fwspec)) devid = get_hpet_devid(fwspec->param[0]); - if (devid < 0) + if (devid < 0 || !amd_iommu_rlookup_table) return 0; iommu = amd_iommu_rlookup_table[devid]; -- 2.25.1
Re: [PATCH][next] drm/amdgpu: Fix masking binary not operator on two mask operations
On Fri, Jan 22, 2021 at 11:00:22PM +0800, Colin King wrote: > From: Colin Ian King > > Currently the ! operator is incorrectly being used to flip bits on > mask values. Fix this by using the bit-wise ~ operator instead. > > Addresses-Coverity: ("Logical vs. bitwise operator") > Fixes: 3c9a7b7d6e75 ("drm/amdgpu: update mmhub mgcg for mmhub_v2_3") > Signed-off-by: Colin Ian King Thanks. Reviewed-by: Huang Rui > --- > drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c > b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c > index 1961745e89c7..ab9be5ad5a5f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c > +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c > @@ -531,12 +531,12 @@ mmhub_v2_3_update_medium_grain_light_sleep(struct > amdgpu_device *adev, > > if (enable && (adev->cg_flags & AMD_CG_SUPPORT_MC_LS)) { > data &= ~MM_ATC_L2_CGTT_CLK_CTRL__MGLS_OVERRIDE_MASK; > - data1 &= !(DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_MASK | > + data1 &= ~(DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_MASK | > DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_WRITE_MASK | > DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_READ_MASK | > DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_RETURN_MASK | > DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_REGISTER_MASK); > - data2 &= !(DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_MASK | > + data2 &= ~(DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_MASK | > DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_WRITE_MASK | > DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_READ_MASK | > DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_RETURN_MASK | > -- > 2.29.2 >
Re: linux-next: build warning after merge of the amdgpu tree
On Mon, Jan 18, 2021 at 02:52:18PM +0800, Stephen Rothwell wrote: > Hi Ray, > > On Mon, 18 Jan 2021 05:56:20 + "Huang, Ray" wrote: > > > > [AMD Public Use] > > > > Thanks Setphen. Please check attached V2 patch. > > (If you can share the config file to me, I can do the test in my side) > > It looks good, but I did not try to build it. I have attached the > config (it is just a powerpc allyesconfig). > -- > Cheers, > Stephen Rothwell Thanks Stephen. Ray
Re: linux-next: build failure after merge of the amdgpu tree
On Fri, Jan 15, 2021 at 01:35:05PM +0800, Stephen Rothwell wrote: > Hi all, > > After merging the amdgpu tree, today's linux-next build (powerpc > allyesconfig) failed like this: > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function > 'vangogh_get_smu_metrics_data': > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:300:10: error: > 'boot_cpu_data' undeclared (first use in this function); did you mean > 'boot_cpuid'? Ah, vangogh is an x86 cpu, let me look at this issue. Could you share me the config file you tested? Thanks, Ray > 300 | boot_cpu_data.x86_max_cores * sizeof(uint16_t)); > | ^ > | boot_cpuid > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function > 'vangogh_read_sensor': > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1320:11: error: > 'boot_cpu_data' undeclared (first use in this function); did you mean > 'boot_cpuid'? > 1320 | *size = boot_cpu_data.x86_max_cores * sizeof(uint16_t); > | ^ > | boot_cpuid > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function > 'vangogh_od_edit_dpm_table': > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1460:19: error: > 'boot_cpu_data' undeclared (first use in this function); did you mean > 'boot_cpuid'? > 1460 | if (input[0] >= boot_cpu_data.x86_max_cores) { > | ^ > | boot_cpuid > > Caused by commits > > 517cb957c43b ("drm/amd/pm: implement the processor clocks which read by > metric") > 0d90d0ddd10e ("drm/amd/pm: implement processor fine grain feature for > vangogh (v3)") > > The only thing I could do easily is to disable CONFIG_DRM_AMDGPU for today. > > -- > Cheers, > Stephen Rothwell
Re: [PATCH] drm: amdgpu: pm: Mark vangogh_clk_dpm_is_enabled() as static
On Wed, Jan 13, 2021 at 10:21:26AM +0800, Huang Rui wrote: > On Wed, Jan 13, 2021 at 10:13:02AM +0800, Alex Deucher wrote: > > On Tue, Jan 12, 2021 at 8:19 PM Huang Rui wrote: > > > > > > On Wed, Jan 13, 2021 at 03:57:22AM +0800, Souptick Joarder wrote: > > > > kernel test robot throws below warnings -> > > > > > > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6: > > > > warning: no previous prototype for 'vangogh_clk_dpm_is_enabled' > > > > [-Wmissing-prototypes] > > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6: > > > > warning: no previous prototype for function 'vangogh_clk_dpm_is_enabled' > > > > [-Wmissing-prototypes] > > > > > > > > Mark vangogh_clk_dpm_is_enabled() as static. > > > > > > > > Reported-by: kernel test robot > > > > Signed-off-by: Souptick Joarder > > > > --- > > > > drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > > b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > > index 75ddcad..3ffe56e 100644 > > > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > > @@ -610,7 +610,7 @@ static int vangogh_get_profiling_clk_mask(struct > > > > smu_context *smu, > > > > return 0; > > > > } > > > > > > > > -bool vangogh_clk_dpm_is_enabled(struct smu_context *smu, > > > > +static bool vangogh_clk_dpm_is_enabled(struct smu_context *smu, > > > > enum smu_clk_type clk_type) > > > > > > Ah, I have another patch which will use this function in another file. > > > > > > > I can drop it if you plan to land those patches soon. > > Thanks Alex. Yes, I will upload them after verify them on the new firmware > today. Sorry Alex, I miss read the function name as "cclk_dpm". This patch is good, please go forward to apply it. Reviewed-by: Huang Rui Thanks, Ray > > Thanks, > Ray > > > > > Alex > > > > > > > Thanks, > > > Ray > > > > > > > { > > > > enum smu_feature_mask feature_id = 0; > > > > -- > > > > 1.9.1 > > > > > > > ___ > > > amd-gfx mailing list > > > amd-...@lists.freedesktop.org > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=04%7C01%7Cray.huang%40amd.com%7C19fce6891f2d4f6df7de08d8b768c8a9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637461007972405505%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=v2JGTkklMHOE2hN1s4dYZ1hT7ctLeUHpwkpn1M3nyi8%3Dreserved=0
Re: [PATCH] drm: amdgpu: pm: Mark vangogh_clk_dpm_is_enabled() as static
On Wed, Jan 13, 2021 at 10:13:02AM +0800, Alex Deucher wrote: > On Tue, Jan 12, 2021 at 8:19 PM Huang Rui wrote: > > > > On Wed, Jan 13, 2021 at 03:57:22AM +0800, Souptick Joarder wrote: > > > kernel test robot throws below warnings -> > > > > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6: > > > warning: no previous prototype for 'vangogh_clk_dpm_is_enabled' > > > [-Wmissing-prototypes] > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6: > > > warning: no previous prototype for function 'vangogh_clk_dpm_is_enabled' > > > [-Wmissing-prototypes] > > > > > > Mark vangogh_clk_dpm_is_enabled() as static. > > > > > > Reported-by: kernel test robot > > > Signed-off-by: Souptick Joarder > > > --- > > > drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > index 75ddcad..3ffe56e 100644 > > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > > > @@ -610,7 +610,7 @@ static int vangogh_get_profiling_clk_mask(struct > > > smu_context *smu, > > > return 0; > > > } > > > > > > -bool vangogh_clk_dpm_is_enabled(struct smu_context *smu, > > > +static bool vangogh_clk_dpm_is_enabled(struct smu_context *smu, > > > enum smu_clk_type clk_type) > > > > Ah, I have another patch which will use this function in another file. > > > > I can drop it if you plan to land those patches soon. Thanks Alex. Yes, I will upload them after verify them on the new firmware today. Thanks, Ray > > Alex > > > > Thanks, > > Ray > > > > > { > > > enum smu_feature_mask feature_id = 0; > > > -- > > > 1.9.1 > > > > > ___ > > amd-gfx mailing list > > amd-...@lists.freedesktop.org > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=04%7C01%7Cray.huang%40amd.com%7C19fce6891f2d4f6df7de08d8b768c8a9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637461007972405505%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=v2JGTkklMHOE2hN1s4dYZ1hT7ctLeUHpwkpn1M3nyi8%3Dreserved=0
Re: [PATCH] drm: amdgpu: pm: Mark vangogh_clk_dpm_is_enabled() as static
On Wed, Jan 13, 2021 at 03:57:22AM +0800, Souptick Joarder wrote: > kernel test robot throws below warnings -> > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6: > warning: no previous prototype for 'vangogh_clk_dpm_is_enabled' > [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6: > warning: no previous prototype for function 'vangogh_clk_dpm_is_enabled' > [-Wmissing-prototypes] > > Mark vangogh_clk_dpm_is_enabled() as static. > > Reported-by: kernel test robot > Signed-off-by: Souptick Joarder > --- > drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > index 75ddcad..3ffe56e 100644 > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c > @@ -610,7 +610,7 @@ static int vangogh_get_profiling_clk_mask(struct > smu_context *smu, > return 0; > } > > -bool vangogh_clk_dpm_is_enabled(struct smu_context *smu, > +static bool vangogh_clk_dpm_is_enabled(struct smu_context *smu, > enum smu_clk_type clk_type) Ah, I have another patch which will use this function in another file. Thanks, Ray > { > enum smu_feature_mask feature_id = 0; > -- > 1.9.1 >
Re: 5.11-rc1 TTM list corruption
On Tue, Jan 05, 2021 at 07:43:51PM +0800, Borislav Petkov wrote: > On Tue, Jan 05, 2021 at 07:08:52PM +0800, Huang Rui wrote: > > Ah, this asic is a bit old and still use radeon driver. So we didn't > > reproduce it on amdgpu driver. I don't have such the old asic in my hand. > > May we know whether this issue can be duplicated after SI which is used > > amdgpu module (not sure whether you have recent APU or GPU)? > > The latest I have (I think it is the latest) is: > > [1.826102] [drm] initializing kernel modesetting (RENOIR 0x1002:0x1636 > 0x17AA:0x5099 0xD1). > > and so far that hasn't triggered it. Which makes sense because that > thing uses amdgpu: > > [1.810260] [drm] amdgpu kernel modesetting enabled. Yes! Renoir is late enough for amdgpu kernel module. :-) Please let us know if you still encounter the issue. Thanks, Ray
Re: 5.11-rc1 TTM list corruption
On Tue, Jan 05, 2021 at 06:31:38PM +0800, Borislav Petkov wrote: > Hi, > > On Tue, Jan 05, 2021 at 12:12:13PM +0800, Huang Rui wrote: > > I am reproducing this issue as well, are you using a Raven board? > > I have no clue what Raven is. The workstation I triggered it once on, has: > > [7.563968] [drm] radeon kernel modesetting enabled. > [7.581417] [drm] initializing kernel modesetting (CAICOS 0x1002:0x6779 > 0x174B:0xE164 0x00). Ah, this asic is a bit old and still use radeon driver. So we didn't reproduce it on amdgpu driver. I don't have such the old asic in my hand. May we know whether this issue can be duplicated after SI which is used amdgpu module (not sure whether you have recent APU or GPU)? Thanks, Ray > [7.609217] [drm] Detected VRAM RAM=2048M, BAR=256M > [7.614031] [drm] RAM width 64bits DDR > [7.639665] [drm] radeon: 2048M of VRAM memory ready > [7.644557] [drm] radeon: 1024M of GTT memory ready. > [7.649451] [drm] Loading CAICOS Microcode > [7.653548] [drm] Internal thermal controller without fan control > [7.661221] [drm] radeon: dpm initialized > [7.665227] [drm] GART: num cpu pages 262144, num gpu pages 262144 > [7.671821] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [7.703858] [drm] PCIE GART of 1024M enabled (table at 0x00162000). > [7.749689] [drm] radeon: irq initialized. > [7.769826] [drm] ring test on 0 succeeded in 1 usecs > [7.774797] [drm] ring test on 3 succeeded in 3 usecs > [7.955500] [drm] ring test on 5 succeeded in 1 usecs > [7.960468] [drm] UVD initialized successfully. > [7.965047] [drm] ib test on ring 0 succeeded in 0 usecs > [7.970316] [drm] ib test on ring 3 succeeded in 0 usecs > [8.626877] [drm] ib test on ring 5 succeeded > [8.631376] [drm] Radeon Display Connectors > [8.635496] [drm] Connector 0: > [8.638503] [drm] HDMI-A-1 > [8.641339] [drm] HPD2 > [8.643835] [drm] DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c > 0x644c > [8.651102] [drm] Encoders: > [8.654022] [drm] DFP1: INTERNAL_UNIPHY1 > [8.658224] [drm] Connector 1: > [8.661232] [drm] DVI-D-1 > [8.663982] [drm] HPD4 > [8.666479] [drm] DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c > 0x646c > [8.673745] [drm] Encoders: > [8.676665] [drm] DFP2: INTERNAL_UNIPHY > [8.680782] [drm] Connector 2: > [8.683789] [drm] VGA-1 > [8.686369] [drm] DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c > 0x643c > [8.693636] [drm] Encoders: > [8.696555] [drm] CRT1: INTERNAL_KLDSCP_DAC1 > [8.788923] [drm] fb mappable at 0xE0363000 > [8.793036] [drm] vram apper at 0xE000 > [8.797064] [drm] size 9216000 > [8.800071] [drm] fb depth is 24 > [8.803249] [drm]pitch is 7680 > [8.807106] fbcon: radeondrmfb (fb0) is primary device > [8.918927] radeon :1d:00.0: [drm] fb0: radeondrmfb frame buffer device > [8.938461] [drm] Initialized radeon 2.50.0 20080528 for :1d:00.0 on > minor 0 > > HTH. > > -- > Regards/Gruss, > Boris. > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquettedata=04%7C01%7Cray.huang%40amd.com%7C31b8dcd4040e4a49380e08d8b16517ad%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637454395066317813%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=al4lLGA%2BCdHK4HzO8M5VJthY8Iv71xQ0TsDGwJpgs1A%3Dreserved=0
Re: 5.11-rc1 TTM list corruption
On Mon, Jan 04, 2021 at 06:58:02PM +0800, Borislav Petkov wrote: > On Fri, Jan 01, 2021 at 03:34:28PM +0100, Christian K?nig wrote: > > Going to double check the code, but can you reproduce this issue > > reliable? > > Lemme find a test box which can trigger it too - the splat happened > on my workstation and I'd like to avoid debugging there for obvious > reasons. Hi Boris, Christian, I am reproducing this issue as well, are you using a Raven board? Thanks, Ray > > Thx. > > -- > Regards/Gruss, > Boris. > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquettedata=04%7C01%7Cray.huang%40amd.com%7C33b48c914b5b4672ac7308d8b09f9a03%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637453546869304657%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=e3Fj4KGz5n0D9O0zGApDTfstJpNmeu6HSJN2oa8iSKA%3Dreserved=0
Re: [PATCH] PCI: Mark AMD Raven iGPU ATS as broken
On Tue, Nov 24, 2020 at 06:51:11AM +0800, Kuehling, Felix wrote: > On 2020-11-23 5:33 p.m., Will Deacon wrote: > > On Mon, Nov 23, 2020 at 09:04:14PM +, Deucher, Alexander wrote: > >> [AMD Public Use] > >> > >>> -Original Message- > >>> From: Will Deacon > >>> Sent: Monday, November 23, 2020 8:44 AM > >>> To: linux-kernel@vger.kernel.org > >>> Cc: linux-...@vger.kernel.org; io...@lists.linux-foundation.org; Will > >>> Deacon ; Bjorn Helgaas ; > >>> Deucher, Alexander ; Edgar Merger > >>> ; Joerg Roedel > >>> Subject: [PATCH] PCI: Mark AMD Raven iGPU ATS as broken > >>> > >>> Edgar Merger reports that the AMD Raven GPU does not work reliably on his > >>> system when the IOMMU is enabled: > >>> > >>>| [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, > >>> signaled seq=1, emitted seq=3 > >>>| [...] > >>>| amdgpu :0b:00.0: GPU reset begin! > >>>| AMD-Vi: Completion-Wait loop timed out > >>>| iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT > >>> device=0b:00.0 address=0x38edc0970] > >>> > >>> This is indicative of a hardware/platform configuration issue so, since > >>> disabling ATS has been shown to resolve the problem, add a quirk to match > >>> this particular device while Edgar follows-up with AMD for more > >>> information. > >>> > >>> Cc: Bjorn Helgaas > >>> Cc: Alex Deucher > >>> Reported-by: Edgar Merger > >>> Suggested-by: Joerg Roedel > >>> Link: > >>> https://lore. > >>> kernel.org/linux- > >>> iommu/MWHPR10MB1310F042A30661D4158520B589FC0@MWHPR10M > >>> B1310.namprd10.prod.outlook.com > >>> her%40amd.com%7C1a883fe14d0c408e7d9508d88fb5df4e%7C3dd8961fe488 > >>> 4e608e11a82d994e183d%7C0%7C0%7C637417358593629699%7CUnknown%7 > >>> CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwi > >>> LCJXVCI6Mn0%3D%7C1000sdata=TMgKldWzsX8XZ0l7q3%2BszDWXQJJ > >>> LOUfX5oGaoLN8n%2B8%3Dreserved=0 > >>> Signed-off-by: Will Deacon > >>> --- > >>> > >>> Hi all, > >>> > >>> Since Joerg is away at the moment, I'm posting this to try to make some > >>> progress with the thread in the Link: tag. > >> + Felix > >> > >> What system is this? Can you provide more details? Does a sbios update > >> fix this? Disabling ATS for all Ravens will break GPU compute for a lot > >> of people. I'd prefer to just black list this particular system (e.g., > >> just SSIDs or revision) if possible. > > +Ray > > There are already many systems where the IOMMU is disabled in the BIOS, > or the CRAT table reporting the APU compute capabilities is broken. Ray > has been working on a fallback to make APUs behave like dGPUs on such > systems. That should also cover this case where ATS is blacklisted. That > said, it affects the programming model, because we don't support the > unified and coherent memory model on dGPUs like we do on APUs with > IOMMUv2. So it would be good to make the conditions for this workaround > as narrow as possible. Yes, besides the comments from Alex and Felix, may we get your firmware version (SMC firmware which is from SBIOS) and device id? > >>>| [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, > >>> signaled seq=1, emitted seq=3 It looks only gfx ib test passed, and fails to lanuch desktop, am I right? We would like to see whether it is Raven, Raven kicker (new Raven), or Picasso. In our side, per the internal test result, we didn't see the similiar issue on Raven kicker and Picasso platform. Thanks, Ray > > These are the relevant changes in KFD and Thunk for reference: > > ### KFD ### > > commit 914913ab04dfbcd0226ecb6bc99d276832ea2908 > Author: Huang Rui > Date: Tue Aug 18 14:54:23 2020 +0800 > > drm/amdkfd: implement the dGPU fallback path for apu (v6) > > We still have a few iommu issues which need to address, so force raven > as "dgpu" path for the moment. > > This is to add the fallback path to bypass IOMMU if IOMMU v2 is > disabled > or ACPI CRAT table not correct. > > v2: Use ignore_crat parameter to decide whether it will go with > IOMMUv2. > v3: Align with existed thunk, don't change the way of raven, only > renoir > will use "dgpu" path by default. > v4: don't u
Re: [PATCH v2 -next] drm/amdkfd: Fix -Wunused-const-variable warning
On Thu, Sep 10, 2020 at 03:50:06PM +0800, YueHaibing wrote: > If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns: > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: > ‘raven_device_info’ defined but not used [-Wunused-const-variable=] > static const struct kfd_device_info raven_device_info = { > ^~~~~~~~~ > > As Huang Rui suggested, Raven already has the fallback path, > so it should be out of IOMMU v2 flag. > > Suggested-by: Huang Rui > Signed-off-by: YueHaibing Acked-by: Huang Rui > --- > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > index 0e71a0543f98..e3fc6ed7b79c 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > @@ -503,8 +503,8 @@ static const struct kfd_device_info > *kfd_supported_devices[][2] = { > #ifdef KFD_SUPPORT_IOMMU_V2 > [CHIP_KAVERI] = {_device_info, NULL}, > [CHIP_CARRIZO] = {_device_info, NULL}, > - [CHIP_RAVEN] = {_device_info, NULL}, > #endif > + [CHIP_RAVEN] = {_device_info, NULL}, > [CHIP_HAWAII] = {_device_info, NULL}, > [CHIP_TONGA] = {_device_info, NULL}, > [CHIP_FIJI] = {_device_info, _vf_device_info}, > -- > 2.17.1 > >
Re: [PATCH -next] drm/amdkfd: Fix -Wunused-const-variable warning
On Thu, Sep 10, 2020 at 10:55:32AM +0800, YueHaibing wrote: > If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns: > > drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: > ‘raven_device_info’ defined but not used [-Wunused-const-variable=] > static const struct kfd_device_info raven_device_info = { > ^ > > Move it to ifdef block. > > Signed-off-by: YueHaibing > --- Raven already has the fallback path, so it should be out of IOMMU v2 flag. You may want to move raven_device_info out of IOMMU v2 flag in kfd_supported_devices[][2] as well. Thanks, Ray > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > index 0e71a0543f98..cae4df259e26 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > @@ -116,7 +116,6 @@ static const struct kfd_device_info carrizo_device_info = > { > .num_xgmi_sdma_engines = 0, > .num_sdma_queues_per_engine = 2, > }; > -#endif > > static const struct kfd_device_info raven_device_info = { > .asic_family = CHIP_RAVEN, > @@ -135,6 +134,7 @@ static const struct kfd_device_info raven_device_info = { > .num_xgmi_sdma_engines = 0, > .num_sdma_queues_per_engine = 2, > }; > +#endif > > static const struct kfd_device_info hawaii_device_info = { > .asic_family = CHIP_HAWAII, > -- > 2.17.1 > > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=02%7C01%7Cray.huang%40amd.com%7Ce8805fd43e9a4ad33cd008d8555a567a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637353193795034052sdata=HqUGjUpQtfCkZ3wA1%2F6HTCZrn%2F4%2BuQORTobzaIYa%2BNs%3Dreserved=0
Re: [PATCH] drm/amdgpu/powerplay: Fix missing break in switch statement
On Fri, Mar 01, 2019 at 03:29:43PM -0600, Gustavo A. R. Silva wrote: > Add missing break statement in order to prevent the code from falling > through to case SMU_Discrete_DpmTable. > > This bug was found thanks to the ongoing efforts to enable > -Wimplicit-fallthrough. > > Fixes: 34a564eaf528 ("drm/amd/powerplay: implement fw image related smum > interface for Polaris.") > Cc: sta...@vger.kernel.org > Signed-off-by: Gustavo A. R. Silva Reviewed-by: Huang Rui > --- > drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c > b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c > index 52abca065764..222fb79d319e 100644 > --- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c > +++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c > @@ -2330,6 +2330,7 @@ static uint32_t polaris10_get_offsetof(uint32_t type, > uint32_t member) > case DRAM_LOG_BUFF_SIZE: > return offsetof(SMU74_SoftRegisters, > DRAM_LOG_BUFF_SIZE); > } > + break; > case SMU_Discrete_DpmTable: > switch (member) { > case UvdBootLevel: > -- > 2.21.0 > > ___ > amd-gfx mailing list > amd-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
On Tue, Jun 05, 2018 at 04:44:04PM +0200, Borislav Petkov wrote: > Hi guys, > > X just froze here ontop of 4.17-rc7+ tip/master (kernel is from last > week) with the splat at the end. > > Box is a x470 chipset with Ryzen 2700X. > > GPU gets detected as > > [7.440971] [drm] radeon kernel modesetting enabled. > [7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 > 0x1043:0x01DA 0x00). > [7.441328] ATOM BIOS: 9598.10.88.0.3.AS05 > [7.441395] radeon :1d:00.0: VRAM: 512M 0x - > 0x1FFF (512M used) > [7.441464] radeon :1d:00.0: GTT: 512M 0x2000 - > 0x3FFF > [7.441531] [drm] Detected VRAM RAM=512M, BAR=256M > [7.441588] [drm] RAM width 128bits DDR > [7.441690] [TTM] Zone kernel: Available graphics memory: 16462214 kiB > [7.441751] [TTM] Zone dma32: Available graphics memory: 2097152 kiB > [7.441811] [TTM] Initializing pool allocator > [7.441868] [TTM] Initializing DMA pool allocator > [7.441934] [drm] radeon: 512M of VRAM memory ready > [7.441990] [drm] radeon: 512M of GTT memory ready. > [7.442050] [drm] Loading RV635 Microcode > [7.442865] [drm] Internal thermal controller without fan control > [7.442940] [drm] radeon: power management initialized > [7.443222] [drm] GART: num cpu pages 131072, num gpu pages 131072 > [7.443487] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [7.477319] [drm] PCIE GART of 512M enabled (table at 0x00142000). > [7.477400] radeon :1d:00.0: WB enabled > [7.477455] radeon :1d:00.0: fence driver on ring 0 use gpu addr > 0x2c00 and cpu addr 0x(ptrval) > [7.477708] radeon :1d:00.0: fence driver on ring 5 use gpu addr > 0x000521d0 and cpu addr 0x(ptrval) > [7.48] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). > [7.477836] [drm] Driver supports precise vblank timestamp query. > [7.477896] radeon :1d:00.0: radeon: MSI limited to 32-bit > [7.477990] radeon :1d:00.0: radeon: using MSI. > [7.478062] [drm] radeon: irq initialized. > [7.509056] [drm] ring test on 0 succeeded in 0 usecs > [7.683793] [drm] ring test on 5 succeeded in 1 usecs > [7.683853] [drm] UVD initialized successfully. > [7.684009] [drm] ib test on ring 0 succeeded in 0 usecs > [8.348466] [drm] ib test on ring 5 succeeded > [8.348921] [drm] Radeon Display Connectors > [8.348978] [drm] Connector 0: > [8.349031] [drm] DVI-I-1 > [8.349082] [drm] HPD1 > [8.349135] [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c > 0x7e5c > [8.349200] [drm] Encoders: > [8.349252] [drm] DFP1: INTERNAL_UNIPHY > [8.349308] [drm] CRT2: INTERNAL_KLDSCP_DAC2 > [8.349364] [drm] Connector 1: > [8.349416] [drm] DIN-1 > [8.349467] [drm] Encoders: > [8.349520] [drm] TV1: INTERNAL_KLDSCP_DAC2 > [8.349576] [drm] Connector 2: > [8.349628] [drm] DVI-I-2 > [8.349680] [drm] HPD2 > [8.349732] [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c > 0x7e4c > [8.349797] [drm] Encoders: > [8.349849] [drm] CRT1: INTERNAL_KLDSCP_DAC1 > [8.349905] [drm] DFP2: INTERNAL_KLDSCP_LVTMA > [8.430521] [drm] fb mappable at 0xE0243000 > [8.430575] [drm] vram apper at 0xE000 > [8.431194] [drm] size 9216000 > [8.431245] [drm] fb depth is 24 > [8.431295] [drm]pitch is 7680 > [8.431406] fbcon: radeondrmfb (fb0) is primary device > [8.496928] Console: switching to colour frame buffer device 240x75 > [8.501851] radeon :1d:00.0: fb0: radeondrmfb frame buffer device > [8.520179] [drm] Initialized radeon 2.50.0 20080528 for :1d:00.0 on > minor 0 > > in the PCIe slot with two monitors connected to it. radeon firmware is > > Version: 20170823-1 > > What practically happened is X froze and got restarted after the GPU > reset. It seems to be ok now, as I'm typing in it. > > Thoughts? > > [197439.022249] Restarting tasks ... done. > [197439.024043] PM: hibernation exit > [197439.058296] r8169 :18:00.0 eth0: link up > [200941.240184] perf: interrupt took too long (2507 > 2500), lowering > kernel.perf_event_max_sample_rate to 79750 > [221973.686894] radeon :1d:00.0: ring 0 stalled for more than 10176msec > [221973.686900] radeon :1d:00.0: GPU lockup (current fence id > 0x010da43f last fence id 0x010da52d on ring 0) > [221973.686929] radeon :1d:00.0: failed to get a new IB (-35) > [221973.686950] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib ! > [221973.693971] radeon :1d:00.0: Saved 7609 dwords of commands on ring 0. > [221973.693985] radeon :1d:00.0: GPU softreset: 0x0008 > [221973.693988] radeon :1d:00.0: R_008010_GRBM_STATUS = 0xA0001030 > [221973.693990] radeon :1d:00.0: R_008014_GRBM_STATUS2 = 0x0003 >
Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
On Tue, Jun 05, 2018 at 04:44:04PM +0200, Borislav Petkov wrote: > Hi guys, > > X just froze here ontop of 4.17-rc7+ tip/master (kernel is from last > week) with the splat at the end. > > Box is a x470 chipset with Ryzen 2700X. > > GPU gets detected as > > [7.440971] [drm] radeon kernel modesetting enabled. > [7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 > 0x1043:0x01DA 0x00). > [7.441328] ATOM BIOS: 9598.10.88.0.3.AS05 > [7.441395] radeon :1d:00.0: VRAM: 512M 0x - > 0x1FFF (512M used) > [7.441464] radeon :1d:00.0: GTT: 512M 0x2000 - > 0x3FFF > [7.441531] [drm] Detected VRAM RAM=512M, BAR=256M > [7.441588] [drm] RAM width 128bits DDR > [7.441690] [TTM] Zone kernel: Available graphics memory: 16462214 kiB > [7.441751] [TTM] Zone dma32: Available graphics memory: 2097152 kiB > [7.441811] [TTM] Initializing pool allocator > [7.441868] [TTM] Initializing DMA pool allocator > [7.441934] [drm] radeon: 512M of VRAM memory ready > [7.441990] [drm] radeon: 512M of GTT memory ready. > [7.442050] [drm] Loading RV635 Microcode > [7.442865] [drm] Internal thermal controller without fan control > [7.442940] [drm] radeon: power management initialized > [7.443222] [drm] GART: num cpu pages 131072, num gpu pages 131072 > [7.443487] [drm] enabling PCIE gen 2 link speeds, disable with > radeon.pcie_gen2=0 > [7.477319] [drm] PCIE GART of 512M enabled (table at 0x00142000). > [7.477400] radeon :1d:00.0: WB enabled > [7.477455] radeon :1d:00.0: fence driver on ring 0 use gpu addr > 0x2c00 and cpu addr 0x(ptrval) > [7.477708] radeon :1d:00.0: fence driver on ring 5 use gpu addr > 0x000521d0 and cpu addr 0x(ptrval) > [7.48] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). > [7.477836] [drm] Driver supports precise vblank timestamp query. > [7.477896] radeon :1d:00.0: radeon: MSI limited to 32-bit > [7.477990] radeon :1d:00.0: radeon: using MSI. > [7.478062] [drm] radeon: irq initialized. > [7.509056] [drm] ring test on 0 succeeded in 0 usecs > [7.683793] [drm] ring test on 5 succeeded in 1 usecs > [7.683853] [drm] UVD initialized successfully. > [7.684009] [drm] ib test on ring 0 succeeded in 0 usecs > [8.348466] [drm] ib test on ring 5 succeeded > [8.348921] [drm] Radeon Display Connectors > [8.348978] [drm] Connector 0: > [8.349031] [drm] DVI-I-1 > [8.349082] [drm] HPD1 > [8.349135] [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c > 0x7e5c > [8.349200] [drm] Encoders: > [8.349252] [drm] DFP1: INTERNAL_UNIPHY > [8.349308] [drm] CRT2: INTERNAL_KLDSCP_DAC2 > [8.349364] [drm] Connector 1: > [8.349416] [drm] DIN-1 > [8.349467] [drm] Encoders: > [8.349520] [drm] TV1: INTERNAL_KLDSCP_DAC2 > [8.349576] [drm] Connector 2: > [8.349628] [drm] DVI-I-2 > [8.349680] [drm] HPD2 > [8.349732] [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c > 0x7e4c > [8.349797] [drm] Encoders: > [8.349849] [drm] CRT1: INTERNAL_KLDSCP_DAC1 > [8.349905] [drm] DFP2: INTERNAL_KLDSCP_LVTMA > [8.430521] [drm] fb mappable at 0xE0243000 > [8.430575] [drm] vram apper at 0xE000 > [8.431194] [drm] size 9216000 > [8.431245] [drm] fb depth is 24 > [8.431295] [drm]pitch is 7680 > [8.431406] fbcon: radeondrmfb (fb0) is primary device > [8.496928] Console: switching to colour frame buffer device 240x75 > [8.501851] radeon :1d:00.0: fb0: radeondrmfb frame buffer device > [8.520179] [drm] Initialized radeon 2.50.0 20080528 for :1d:00.0 on > minor 0 > > in the PCIe slot with two monitors connected to it. radeon firmware is > > Version: 20170823-1 > > What practically happened is X froze and got restarted after the GPU > reset. It seems to be ok now, as I'm typing in it. > > Thoughts? > > [197439.022249] Restarting tasks ... done. > [197439.024043] PM: hibernation exit > [197439.058296] r8169 :18:00.0 eth0: link up > [200941.240184] perf: interrupt took too long (2507 > 2500), lowering > kernel.perf_event_max_sample_rate to 79750 > [221973.686894] radeon :1d:00.0: ring 0 stalled for more than 10176msec > [221973.686900] radeon :1d:00.0: GPU lockup (current fence id > 0x010da43f last fence id 0x010da52d on ring 0) > [221973.686929] radeon :1d:00.0: failed to get a new IB (-35) > [221973.686950] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib ! > [221973.693971] radeon :1d:00.0: Saved 7609 dwords of commands on ring 0. > [221973.693985] radeon :1d:00.0: GPU softreset: 0x0008 > [221973.693988] radeon :1d:00.0: R_008010_GRBM_STATUS = 0xA0001030 > [221973.693990] radeon :1d:00.0: R_008014_GRBM_STATUS2 = 0x0003 >
Re: [PATCH] drm/radeon: change function signature to pass full range
On Thu, Apr 12, 2018 at 09:33:33PM +0200, Mathieu Malaterre wrote: > In function ‘radeon_process_i2c_ch’ a comparison of a u8 value against > 255 is done. Since it is always false, change the signature of this > function to use an `int` instead, which match the type used in caller: > `radeon_atom_hw_i2c_xfer`. > > Fix the following warning triggered with W=1: > > CC [M] drivers/gpu/drm/radeon/atombios_i2c.o > drivers/gpu/drm/radeon/atombios_i2c.c: In function ‘radeon_process_i2c_ch’: > drivers/gpu/drm/radeon/atombios_i2c.c:71:11: warning: comparison is always > false due to limited range of data type [-Wtype-limits] >if (num > ATOM_MAX_HW_I2C_READ) { >^ > > Signed-off-by: Mathieu Malaterre <ma...@debian.org> Reviewed-by: Huang Rui <ray.hu...@amd.com> > --- > drivers/gpu/drm/radeon/atombios_i2c.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/atombios_i2c.c > b/drivers/gpu/drm/radeon/atombios_i2c.c > index 4157780585a0..9022e9af11a0 100644 > --- a/drivers/gpu/drm/radeon/atombios_i2c.c > +++ b/drivers/gpu/drm/radeon/atombios_i2c.c > @@ -35,7 +35,7 @@ > > static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan, >u8 slave_addr, u8 flags, > - u8 *buf, u8 num) > + u8 *buf, int num) > { > struct drm_device *dev = chan->dev; > struct radeon_device *rdev = dev->dev_private; > -- > 2.11.0 > > ___ > amd-gfx mailing list > amd-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/radeon: change function signature to pass full range
On Thu, Apr 12, 2018 at 09:33:33PM +0200, Mathieu Malaterre wrote: > In function ‘radeon_process_i2c_ch’ a comparison of a u8 value against > 255 is done. Since it is always false, change the signature of this > function to use an `int` instead, which match the type used in caller: > `radeon_atom_hw_i2c_xfer`. > > Fix the following warning triggered with W=1: > > CC [M] drivers/gpu/drm/radeon/atombios_i2c.o > drivers/gpu/drm/radeon/atombios_i2c.c: In function ‘radeon_process_i2c_ch’: > drivers/gpu/drm/radeon/atombios_i2c.c:71:11: warning: comparison is always > false due to limited range of data type [-Wtype-limits] >if (num > ATOM_MAX_HW_I2C_READ) { >^ > > Signed-off-by: Mathieu Malaterre Reviewed-by: Huang Rui > --- > drivers/gpu/drm/radeon/atombios_i2c.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/atombios_i2c.c > b/drivers/gpu/drm/radeon/atombios_i2c.c > index 4157780585a0..9022e9af11a0 100644 > --- a/drivers/gpu/drm/radeon/atombios_i2c.c > +++ b/drivers/gpu/drm/radeon/atombios_i2c.c > @@ -35,7 +35,7 @@ > > static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan, >u8 slave_addr, u8 flags, > - u8 *buf, u8 num) > + u8 *buf, int num) > { > struct drm_device *dev = chan->dev; > struct radeon_device *rdev = dev->dev_private; > -- > 2.11.0 > > ___ > amd-gfx mailing list > amd-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers
On Tue, Apr 10, 2018 at 04:59:55PM -0400, Sinan Kaya wrote: > Code is expecing to observe the same number of buffers returned from > dma_map_sg() function compared to sg_alloc_table_from_pages(). This > doesn't hold true universally especially for systems with IOMMU. > > IOMMU driver tries to combine buffers into a single DMA address as much > as it can. The right thing is to tell the DMA layer how much combining > IOMMU can do. > > Signed-off-by: Sinan KayaSinan, I guess Christian's suggestion is to add amdgpu_device_init function like here: 8<--- diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 0e798b3..9b96771 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2339,6 +2339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev, /* init the mode config */ drm_mode_config_init(adev->ddev); + dma_set_max_seg_size(adev->dev, PAGE_SIZE); + r = amdgpu_device_ip_init(adev); if (r) { /* failed in exclusive mode due to timeout */ 8<--- After that, we don't do it in each generation. Thanks, Ray
Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers
On Tue, Apr 10, 2018 at 04:59:55PM -0400, Sinan Kaya wrote: > Code is expecing to observe the same number of buffers returned from > dma_map_sg() function compared to sg_alloc_table_from_pages(). This > doesn't hold true universally especially for systems with IOMMU. > > IOMMU driver tries to combine buffers into a single DMA address as much > as it can. The right thing is to tell the DMA layer how much combining > IOMMU can do. > > Signed-off-by: Sinan Kaya Sinan, I guess Christian's suggestion is to add amdgpu_device_init function like here: 8<--- diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 0e798b3..9b96771 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2339,6 +2339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev, /* init the mode config */ drm_mode_config_init(adev->ddev); + dma_set_max_seg_size(adev->dev, PAGE_SIZE); + r = amdgpu_device_ip_init(adev); if (r) { /* failed in exclusive mode due to timeout */ 8<--- After that, we don't do it in each generation. Thanks, Ray
Re: perf/x86/amd/power missing pmu::module initialisation
On Tue, Nov 07, 2017 at 08:50:52PM +0800, Huang Rui wrote: > Hi Mark, > > On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote: > > Hi, > > > > As a heads-up, I believe that arch/x86/events/amd/power.c is currently > > missing initialisation of pmu::module, which could result in problems > > when the module is unloaded. > > > > We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT > > the AMD power PMU is the only other PMU affected. > > > > Other modular PMUs all initialise this to THIS_MODULE prior to calling > > perf_pmu_register(). > > > > Thanks to let me know. I have dived into pmu for quite a while. Could you Fix typo -> "I haven't dived into ..." Thanks, Rui
Re: perf/x86/amd/power missing pmu::module initialisation
On Tue, Nov 07, 2017 at 08:50:52PM +0800, Huang Rui wrote: > Hi Mark, > > On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote: > > Hi, > > > > As a heads-up, I believe that arch/x86/events/amd/power.c is currently > > missing initialisation of pmu::module, which could result in problems > > when the module is unloaded. > > > > We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT > > the AMD power PMU is the only other PMU affected. > > > > Other modular PMUs all initialise this to THIS_MODULE prior to calling > > perf_pmu_register(). > > > > Thanks to let me know. I have dived into pmu for quite a while. Could you Fix typo -> "I haven't dived into ..." Thanks, Rui
Re: perf/x86/amd/power missing pmu::module initialisation
Hi Mark, On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote: > Hi, > > As a heads-up, I believe that arch/x86/events/amd/power.c is currently > missing initialisation of pmu::module, which could result in problems > when the module is unloaded. > > We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT > the AMD power PMU is the only other PMU affected. > > Other modular PMUs all initialise this to THIS_MODULE prior to calling > perf_pmu_register(). > Thanks to let me know. I have dived into pmu for quite a while. Could you please share more information or forward related mail thread, so that I can repro this issue in my side. :-) Thanks, Rui
Re: perf/x86/amd/power missing pmu::module initialisation
Hi Mark, On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote: > Hi, > > As a heads-up, I believe that arch/x86/events/amd/power.c is currently > missing initialisation of pmu::module, which could result in problems > when the module is unloaded. > > We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT > the AMD power PMU is the only other PMU affected. > > Other modular PMUs all initialise this to THIS_MODULE prior to calling > perf_pmu_register(). > Thanks to let me know. I have dived into pmu for quite a while. Could you please share more information or forward related mail thread, so that I can repro this issue in my side. :-) Thanks, Rui
Re: [PATCH 65/66] hwmon: (fam15h_power) use permission-specific DEVICE_ATTR variants
On Thu, Dec 22, 2016 at 01:05:34PM +0100, Julia Lawall wrote: > Use DEVICE_ATTR_RO etc. for read only attributes etc. This simplifies the > source code, improves readbility, and reduces the chance of > inconsistencies. > > The semantic patch for the RO case, in the case where the show function > already has the expected name, is as follows: > (http://coccinelle.lip6.fr/) > > // > @ro@ > declarer name DEVICE_ATTR; > identifier x,x_show; > @@ > > DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > > @script:ocaml@ > x << ro.x; > x_show << ro.x_show; > @@ > > if not (x^"_show" = x_show) then Coccilib.include_match false > > @@ > declarer name DEVICE_ATTR_RO; > identifier ro.x,ro.x_show; > @@ > > - DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > + DEVICE_ATTR_RO(x); > // > > Signed-off-by: Julia Lawall <julia.law...@lip6.fr> > Acked-by: Huang Rui <ray.hu...@amd.com> Thanks, Rui > --- > drivers/hwmon/fam15h_power.c | 34 -- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c > index 15aa49d..9545a34 100644 > --- a/drivers/hwmon/fam15h_power.c > +++ b/drivers/hwmon/fam15h_power.c > @@ -83,8 +83,8 @@ static bool is_carrizo_or_later(void) > return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; > } > > -static ssize_t show_power(struct device *dev, > - struct device_attribute *attr, char *buf) > +static ssize_t power1_input_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > u32 val, tdp_limit, running_avg_range; > s32 running_avg_capture; > @@ -136,16 +136,16 @@ static ssize_t show_power(struct device *dev, > curr_pwr_watts = (curr_pwr_watts * 15625) >> (10 + running_avg_range); > return sprintf(buf, "%u\n", (unsigned int) curr_pwr_watts); > } > -static DEVICE_ATTR(power1_input, S_IRUGO, show_power, NULL); > +static DEVICE_ATTR_RO(power1_input); > > -static ssize_t show_power_crit(struct device *dev, > -struct device_attribute *attr, char *buf) > +static ssize_t power1_crit_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%u\n", data->processor_pwr_watts); > } > -static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); > +static DEVICE_ATTR_RO(power1_crit); > > static void do_read_registers_on_cu(void *_data) > { > @@ -212,9 +212,8 @@ static int read_registers(struct fam15h_power_data *data) > return 0; > } > > -static ssize_t acc_show_power(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_show(struct device *dev, > +struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], > @@ -267,20 +266,20 @@ static ssize_t acc_show_power(struct device *dev, > > return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); > } > -static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); > +static DEVICE_ATTR_RO(power1_average); > > -static ssize_t acc_show_power_period(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_interval_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%lu\n", data->power_period); > } > > -static ssize_t acc_set_power_period(struct device *dev, > - struct device_attribute *attr, > - const char *buf, size_t count) > +static ssize_t power1_average_interval_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > unsigned long temp; > @@ -301,8 +300,7 @@ static ssize_t acc_set_power_period(struct device *dev, > > return count; > } > -static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR, > -acc_show_power_period, acc_set_power_period); > +static DEVICE_ATTR_RW(power1_average_interval); > > static int fam15h_power_init_attrs(struct pci_dev *pdev, > struct fam15h_power_data *data) >
Re: [PATCH 65/66] hwmon: (fam15h_power) use permission-specific DEVICE_ATTR variants
On Thu, Dec 22, 2016 at 01:05:34PM +0100, Julia Lawall wrote: > Use DEVICE_ATTR_RO etc. for read only attributes etc. This simplifies the > source code, improves readbility, and reduces the chance of > inconsistencies. > > The semantic patch for the RO case, in the case where the show function > already has the expected name, is as follows: > (http://coccinelle.lip6.fr/) > > // > @ro@ > declarer name DEVICE_ATTR; > identifier x,x_show; > @@ > > DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > > @script:ocaml@ > x << ro.x; > x_show << ro.x_show; > @@ > > if not (x^"_show" = x_show) then Coccilib.include_match false > > @@ > declarer name DEVICE_ATTR_RO; > identifier ro.x,ro.x_show; > @@ > > - DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL); > + DEVICE_ATTR_RO(x); > // > > Signed-off-by: Julia Lawall > Acked-by: Huang Rui Thanks, Rui > --- > drivers/hwmon/fam15h_power.c | 34 -- > 1 file changed, 16 insertions(+), 18 deletions(-) > > diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c > index 15aa49d..9545a34 100644 > --- a/drivers/hwmon/fam15h_power.c > +++ b/drivers/hwmon/fam15h_power.c > @@ -83,8 +83,8 @@ static bool is_carrizo_or_later(void) > return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; > } > > -static ssize_t show_power(struct device *dev, > - struct device_attribute *attr, char *buf) > +static ssize_t power1_input_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > u32 val, tdp_limit, running_avg_range; > s32 running_avg_capture; > @@ -136,16 +136,16 @@ static ssize_t show_power(struct device *dev, > curr_pwr_watts = (curr_pwr_watts * 15625) >> (10 + running_avg_range); > return sprintf(buf, "%u\n", (unsigned int) curr_pwr_watts); > } > -static DEVICE_ATTR(power1_input, S_IRUGO, show_power, NULL); > +static DEVICE_ATTR_RO(power1_input); > > -static ssize_t show_power_crit(struct device *dev, > -struct device_attribute *attr, char *buf) > +static ssize_t power1_crit_show(struct device *dev, > + struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%u\n", data->processor_pwr_watts); > } > -static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); > +static DEVICE_ATTR_RO(power1_crit); > > static void do_read_registers_on_cu(void *_data) > { > @@ -212,9 +212,8 @@ static int read_registers(struct fam15h_power_data *data) > return 0; > } > > -static ssize_t acc_show_power(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_show(struct device *dev, > +struct device_attribute *attr, char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], > @@ -267,20 +266,20 @@ static ssize_t acc_show_power(struct device *dev, > > return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); > } > -static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); > +static DEVICE_ATTR_RO(power1_average); > > -static ssize_t acc_show_power_period(struct device *dev, > - struct device_attribute *attr, > - char *buf) > +static ssize_t power1_average_interval_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > > return sprintf(buf, "%lu\n", data->power_period); > } > > -static ssize_t acc_set_power_period(struct device *dev, > - struct device_attribute *attr, > - const char *buf, size_t count) > +static ssize_t power1_average_interval_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > { > struct fam15h_power_data *data = dev_get_drvdata(dev); > unsigned long temp; > @@ -301,8 +300,7 @@ static ssize_t acc_set_power_period(struct device *dev, > > return count; > } > -static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR, > -acc_show_power_period, acc_set_power_period); > +static DEVICE_ATTR_RW(power1_average_interval); > > static int fam15h_power_init_attrs(struct pci_dev *pdev, > struct fam15h_power_data *data) >
Re: [PATCH] drm/amd/powerplay/smumgr: mark symbols static where possible
Hi Baoyou, On Mon, Sep 19, 2016 at 03:28:39PM +0800, Baoyou Xie wrote: > We get a few warnings when building kernel with W=1: > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/tonga_smumgr.c:146:5: warning: > no previous prototype for 'tonga_program_jump_on_start' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/fiji_smumgr.c:816:5: warning: > no previous prototype for 'fiji_save_vft_table' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/polaris10_smumgr.c:743:5: > warning: no previous prototype for 'polaris10_avfs_event_mgr' > [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/iceland_smumgr.c:167:5: > warning: no previous prototype for 'iceland_program_jump_on_start' > [-Wmissing-prototypes] > > > In fact, these functions are only used in the file in which they are > declared and don't need a declaration, but can be made static. > So this patch marks these functions with 'static'. > + amd-gfx mail list. Thanks to your patch. But powerplay part is refactored recently below, some functions are already updated. So you might need rebase them. :-) https://lists.freedesktop.org/archives/amd-gfx/2016-September/002118.html Thanks, Rui
Re: [PATCH] drm/amd/powerplay/smumgr: mark symbols static where possible
Hi Baoyou, On Mon, Sep 19, 2016 at 03:28:39PM +0800, Baoyou Xie wrote: > We get a few warnings when building kernel with W=1: > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/tonga_smumgr.c:146:5: warning: > no previous prototype for 'tonga_program_jump_on_start' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/fiji_smumgr.c:816:5: warning: > no previous prototype for 'fiji_save_vft_table' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/polaris10_smumgr.c:743:5: > warning: no previous prototype for 'polaris10_avfs_event_mgr' > [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/iceland_smumgr.c:167:5: > warning: no previous prototype for 'iceland_program_jump_on_start' > [-Wmissing-prototypes] > > > In fact, these functions are only used in the file in which they are > declared and don't need a declaration, but can be made static. > So this patch marks these functions with 'static'. > + amd-gfx mail list. Thanks to your patch. But powerplay part is refactored recently below, some functions are already updated. So you might need rebase them. :-) https://lists.freedesktop.org/archives/amd-gfx/2016-September/002118.html Thanks, Rui
Re: [PATCH] drm: amdgpu: mark symbols static where possible
On Sat, Sep 03, 2016 at 01:57:14PM +0800, Baoyou Xie wrote: > We get a few warnings when building kernel with W=1: > drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for > 'cz_send_msg_to_smc_async' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for > 'cz_write_smc_sram_dword' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous > prototype for 'iceland_start_smc' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype > for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype > for 'amdgpu_job_free_cb' [-Wmissing-prototypes] > > > In fact, these functions are only used in the file in which they are > declared and don't need a declaration, but can be made static. > So this patch marks these functions with 'static'. > > Signed-off-by: Baoyou Xie <baoyou@linaro.org> Thanks. Acked-by: Huang Rui <ray.hu...@amd.com>
Re: [PATCH] drm: amdgpu: mark symbols static where possible
On Sat, Sep 03, 2016 at 01:57:14PM +0800, Baoyou Xie wrote: > We get a few warnings when building kernel with W=1: > drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for > 'cz_send_msg_to_smc_async' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for > 'cz_write_smc_sram_dword' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous > prototype for 'iceland_start_smc' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype > for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes] > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype > for 'amdgpu_job_free_cb' [-Wmissing-prototypes] > > > In fact, these functions are only used in the file in which they are > declared and don't need a declaration, but can be made static. > So this patch marks these functions with 'static'. > > Signed-off-by: Baoyou Xie Thanks. Acked-by: Huang Rui
Re: perf: fuzzer crashes immediately on AMD system
Hi Peter, Vince On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote: > On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote: > > On Thu, 18 Aug 2016, Vince Weaver wrote: > > > > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it > > > falls over more or less immediately. > > > > > > This maps to variable_test_bit() > > > called by ctx = find_get_context(pmu, task, event); > > > in kernel/events/core.c:9467 > > > > > > It happens quickly enough I can probably track down the exact event that > > > causes this, if needed. > > > > I have a one line reproducer: > > > > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls > > OK, cannot reproduce on my fam15h/model1h. I'll go dig through the > various manuals to see if I can spot the fail. > > Huang could you either prod someone at AMD or do yourself, audit the AMD > perf code for all the various new models? Actually, there might be some NBPMC event changes between model 0h-fh and model 10h-1fh. Below are the documents of these two processors: http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf http://support.amd.com/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf In section 3.16, it describes usage of NB Performance Counter Events. Hope it helps. :-) Thanks, Rui
Re: perf: fuzzer crashes immediately on AMD system
Hi Peter, Vince On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote: > On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote: > > On Thu, 18 Aug 2016, Vince Weaver wrote: > > > > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it > > > falls over more or less immediately. > > > > > > This maps to variable_test_bit() > > > called by ctx = find_get_context(pmu, task, event); > > > in kernel/events/core.c:9467 > > > > > > It happens quickly enough I can probably track down the exact event that > > > causes this, if needed. > > > > I have a one line reproducer: > > > > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls > > OK, cannot reproduce on my fam15h/model1h. I'll go dig through the > various manuals to see if I can spot the fail. > > Huang could you either prod someone at AMD or do yourself, audit the AMD > perf code for all the various new models? Actually, there might be some NBPMC event changes between model 0h-fh and model 10h-1fh. Below are the documents of these two processors: http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf http://support.amd.com/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf In section 3.16, it describes usage of NB Performance Counter Events. Hope it helps. :-) Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Tue, Jul 19, 2016 at 02:22:36PM -0400, Vince Weaver wrote: > On Fri, 17 Jun 2016, Huang Rui wrote: > > > On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote: > > > On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote: > > > > I was told this feature would be supported on fam15h 60h, 70h and > > > > later processors before. Just checked the fam16h model 30h BKDG, yes, > > > > it should be also supported. But I didn't test that platform, if you > > > > confirm it works in your side. We can enable it. > > > > > > You might want to ask around first whether F16M30's acc power machinery > > > is even usable? I.e., no errata and whatnot... > > > > > > > Yep, I already asked the designers, and was waiting for their > > feedbacks. > > Was there ever any feedback about any of the problems encountered with AMD > APM support? > Vince, apology to late response. We are drafting the erratum for this feature. Will let you know if it is public. Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Tue, Jul 19, 2016 at 02:22:36PM -0400, Vince Weaver wrote: > On Fri, 17 Jun 2016, Huang Rui wrote: > > > On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote: > > > On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote: > > > > I was told this feature would be supported on fam15h 60h, 70h and > > > > later processors before. Just checked the fam16h model 30h BKDG, yes, > > > > it should be also supported. But I didn't test that platform, if you > > > > confirm it works in your side. We can enable it. > > > > > > You might want to ask around first whether F16M30's acc power machinery > > > is even usable? I.e., no errata and whatnot... > > > > > > > Yep, I already asked the designers, and was waiting for their > > feedbacks. > > Was there ever any feedback about any of the problems encountered with AMD > APM support? > Vince, apology to late response. We are drafting the erratum for this feature. Will let you know if it is public. Thanks, Rui
Re: [patch] events/amd/power add support for fam16h model30h
On Thu, Jun 16, 2016 at 11:12:18PM +0200, Borislav Petkov wrote: > On Thu, Jun 16, 2016 at 05:00:04PM -0400, Vince Weaver wrote: > > > > According to the BKDG the AMD Family16h Model30h "Jaguar Mullins" > > also supports the accumulated power interface. I've tested on > > hardware I have and with this patch I indeed get power readings using > > perf. > > > > Signed-off-by: Vince Weaver> > > > diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c > > index 55a3529..778b77d 100644 > > --- a/arch/x86/events/amd/power.c > > +++ b/arch/x86/events/amd/power.c > > @@ -292,6 +292,7 @@ static struct notifier_block power_cpu_notifier_nb = { > > > > static const struct x86_cpu_id cpu_match[] = { > > { .vendor = X86_VENDOR_AMD, .family = 0x15 }, > > + { .vendor = X86_VENDOR_AMD, .family = 0x16 }, > > {}, > > Actually, I think we remove that table completely and rely solely on the > CPUID bit: > > if (!boot_cpu_has(X86_FEATURE_ACC_POWER)) > return -ENODEV; > > Rui? > Agree with you. If the some chips are not stable, we can add a check to ignore them with family and model id. Thanks, Rui
Re: [patch] events/amd/power add support for fam16h model30h
On Thu, Jun 16, 2016 at 11:12:18PM +0200, Borislav Petkov wrote: > On Thu, Jun 16, 2016 at 05:00:04PM -0400, Vince Weaver wrote: > > > > According to the BKDG the AMD Family16h Model30h "Jaguar Mullins" > > also supports the accumulated power interface. I've tested on > > hardware I have and with this patch I indeed get power readings using > > perf. > > > > Signed-off-by: Vince Weaver > > > > diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c > > index 55a3529..778b77d 100644 > > --- a/arch/x86/events/amd/power.c > > +++ b/arch/x86/events/amd/power.c > > @@ -292,6 +292,7 @@ static struct notifier_block power_cpu_notifier_nb = { > > > > static const struct x86_cpu_id cpu_match[] = { > > { .vendor = X86_VENDOR_AMD, .family = 0x15 }, > > + { .vendor = X86_VENDOR_AMD, .family = 0x16 }, > > {}, > > Actually, I think we remove that table completely and rely solely on the > CPUID bit: > > if (!boot_cpu_has(X86_FEATURE_ACC_POWER)) > return -ENODEV; > > Rui? > Agree with you. If the some chips are not stable, we can add a check to ignore them with family and model id. Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Thu, Jun 16, 2016 at 04:44:20PM -0400, Vince Weaver wrote: > On Thu, 16 Jun 2016, Huang Rui wrote: > > > > 1. In theory this should also work on an amd fam16h model 30h > > > processor too, correct? The current code limits things to fam15h > > > even though the fam16mod30h has all the proper cpuid flags. > > > > > > > I was told this feature would be supported on fam15h 60h, 70h and > > later processors before. Just checked the fam16h model 30h BKDG, yes, > > it should be also supported. But I didn't test that platform, if you > > confirm it works in your side. We can enable it. > > I can confirm I get power readings on my fam16hmod30h board once I apply a > trivial patch to the driver. I'll send the patch in a separate e-mail. > OK, thanks. > > PTSC's frequency is about 100Mhz, it shouldn't be overflow. > > That's what I thought. I'm trying to read the value using the /dev/msr > interface from userspace and I get weird results. > > i.e.: > Jx: read 62d299b84 > PTSC MSR: read 72fe92 > > sleep 5ms > > Jy: read 631b453b9 > PTSC MSR: read 46b25 > > this happens about half the time (PTSC going backwards). Though > admittedly the problem could somehow be in the MSR code I'm using. > Can you try to read the MSR value two times with the same core (rdmsrl_on_cpu)? Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Thu, Jun 16, 2016 at 04:44:20PM -0400, Vince Weaver wrote: > On Thu, 16 Jun 2016, Huang Rui wrote: > > > > 1. In theory this should also work on an amd fam16h model 30h > > > processor too, correct? The current code limits things to fam15h > > > even though the fam16mod30h has all the proper cpuid flags. > > > > > > > I was told this feature would be supported on fam15h 60h, 70h and > > later processors before. Just checked the fam16h model 30h BKDG, yes, > > it should be also supported. But I didn't test that platform, if you > > confirm it works in your side. We can enable it. > > I can confirm I get power readings on my fam16hmod30h board once I apply a > trivial patch to the driver. I'll send the patch in a separate e-mail. > OK, thanks. > > PTSC's frequency is about 100Mhz, it shouldn't be overflow. > > That's what I thought. I'm trying to read the value using the /dev/msr > interface from userspace and I get weird results. > > i.e.: > Jx: read 62d299b84 > PTSC MSR: read 72fe92 > > sleep 5ms > > Jy: read 631b453b9 > PTSC MSR: read 46b25 > > this happens about half the time (PTSC going backwards). Though > admittedly the problem could somehow be in the MSR code I'm using. > Can you try to read the MSR value two times with the same core (rdmsrl_on_cpu)? Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote: > On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote: > > I was told this feature would be supported on fam15h 60h, 70h and > > later processors before. Just checked the fam16h model 30h BKDG, yes, > > it should be also supported. But I didn't test that platform, if you > > confirm it works in your side. We can enable it. > > You might want to ask around first whether F16M30's acc power machinery > is even usable? I.e., no errata and whatnot... > Yep, I already asked the designers, and was waiting for their feedbacks. Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote: > On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote: > > I was told this feature would be supported on fam15h 60h, 70h and > > later processors before. Just checked the fam16h model 30h BKDG, yes, > > it should be also supported. But I didn't test that platform, if you > > confirm it works in your side. We can enable it. > > You might want to ask around first whether F16M30's acc power machinery > is even usable? I.e., no errata and whatnot... > Yep, I already asked the designers, and was waiting for their feedbacks. Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
Hi Vince, Thanks for asking. On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote: > > three questions about this functionality: > > 1. In theory this should also work on an amd fam16h model 30h > processor too, correct? The current code limits things to fam15h > even though the fam16mod30h has all the proper cpuid flags. > I was told this feature would be supported on fam15h 60h, 70h and later processors before. Just checked the fam16h model 30h BKDG, yes, it should be also supported. But I didn't test that platform, if you confirm it works in your side. We can enable it. > I've tested the functionality a bit and it seems to work but for > some reason the ptsc seems to occasionally count backwards > on my machine. Any reason that would be? (It doesn't seem to be > an overflow, just reading the ptsc 5ms apart and the values are > slightly lower after than before). > PTSC's frequency is about 100Mhz, it shouldn't be overflow. > 2. Unless I'm misunderstanding things, the code seems to be accumulating > Power. (see chunk below) Power is an instantaneous measurement, it > makes no sense to add values. If you use 5W for 1ms and 10W for > 1ms, the average power across the 2ms interval is not 15W. > > You can add energy, but not power. > > > + delta *= cpu_pwr_sample_ratio * 1000; > > + tdelta = new_ptsc - prev_ptsc; > > + > > + do_div(delta, tdelta); > > + local64_add(delta, >count); > You're right. Nice catch! The average power is per compute unit. We cannot add the power simplely for each processor/package. So here, the average power per package should be (delta1 + delta2 + ... + deltaN)/(tdelta_avg). I will work out a fix. Thanks to point out. > 3. The actual results gathered seem rediculously low. 341 seconds of > calculation and only using 183 mWatts of power? > mWatts are for processor power not system power. Below data is calculated on fam15h model 60h which is low power platform. Even though the method has a minor mistake, the processor power should be in mWatts field. > >Performance counter stats for 'system wide': > > > > 183.44 mWatts power/power-pkg/ > > > >341.837270111 seconds time elapsed > > > > root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' > > sleep 10 > Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
Hi Vince, Thanks for asking. On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote: > > three questions about this functionality: > > 1. In theory this should also work on an amd fam16h model 30h > processor too, correct? The current code limits things to fam15h > even though the fam16mod30h has all the proper cpuid flags. > I was told this feature would be supported on fam15h 60h, 70h and later processors before. Just checked the fam16h model 30h BKDG, yes, it should be also supported. But I didn't test that platform, if you confirm it works in your side. We can enable it. > I've tested the functionality a bit and it seems to work but for > some reason the ptsc seems to occasionally count backwards > on my machine. Any reason that would be? (It doesn't seem to be > an overflow, just reading the ptsc 5ms apart and the values are > slightly lower after than before). > PTSC's frequency is about 100Mhz, it shouldn't be overflow. > 2. Unless I'm misunderstanding things, the code seems to be accumulating > Power. (see chunk below) Power is an instantaneous measurement, it > makes no sense to add values. If you use 5W for 1ms and 10W for > 1ms, the average power across the 2ms interval is not 15W. > > You can add energy, but not power. > > > + delta *= cpu_pwr_sample_ratio * 1000; > > + tdelta = new_ptsc - prev_ptsc; > > + > > + do_div(delta, tdelta); > > + local64_add(delta, >count); > You're right. Nice catch! The average power is per compute unit. We cannot add the power simplely for each processor/package. So here, the average power per package should be (delta1 + delta2 + ... + deltaN)/(tdelta_avg). I will work out a fix. Thanks to point out. > 3. The actual results gathered seem rediculously low. 341 seconds of > calculation and only using 183 mWatts of power? > mWatts are for processor power not system power. Below data is calculated on fam15h model 60h which is low power platform. Even though the method has a minor mistake, the processor power should be in mWatts field. > >Performance counter stats for 'system wide': > > > > 183.44 mWatts power/power-pkg/ > > > >341.837270111 seconds time elapsed > > > > root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' > > sleep 10 > Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Thu, Jun 16, 2016 at 01:38:13PM +0800, Huang Rui wrote: > On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote: > > > > 2. Unless I'm misunderstanding things, the code seems to be accumulating > > Power. (see chunk below) Power is an instantaneous measurement, it > > makes no sense to add values. If you use 5W for 1ms and 10W for > > 1ms, the average power across the 2ms interval is not 15W. > > > > You can add energy, but not power. > > > > > + delta *= cpu_pwr_sample_ratio * 1000; > > > + tdelta = new_ptsc - prev_ptsc; > > > + > > > + do_div(delta, tdelta); > > > + local64_add(delta, >count); > > > > You're right. Nice catch! The average power is per compute unit. We > cannot add the power simplely for each processor/package. > > So here, the average power per package should be (delta1 + delta2 + ... + > deltaN)/(tdelta_avg). > I will work out a fix. Thanks to point out. > After considering carefully, the original method should be OK. AMD nomenclature for CMT systems: [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0 -> [Compute Unit Core 1] -> Linux CPU 1 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2 -> [Compute Unit Core 1] -> Linux CPU 3 The deltaN is power per compute unit. Current one package has two CUs. In the *same* interval, CU0's power is 10W, CU1's power is 15W. The package (CU0 + CU1) power should be 15W, right? Because the interval is the same. Q = Q1 + Q2. P = Q/t = (Q1 + Q2)/t = Q1/t + Q2/t = P1 + P2. Is that clear? Thanks, Rui
Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism
On Thu, Jun 16, 2016 at 01:38:13PM +0800, Huang Rui wrote: > On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote: > > > > 2. Unless I'm misunderstanding things, the code seems to be accumulating > > Power. (see chunk below) Power is an instantaneous measurement, it > > makes no sense to add values. If you use 5W for 1ms and 10W for > > 1ms, the average power across the 2ms interval is not 15W. > > > > You can add energy, but not power. > > > > > + delta *= cpu_pwr_sample_ratio * 1000; > > > + tdelta = new_ptsc - prev_ptsc; > > > + > > > + do_div(delta, tdelta); > > > + local64_add(delta, >count); > > > > You're right. Nice catch! The average power is per compute unit. We > cannot add the power simplely for each processor/package. > > So here, the average power per package should be (delta1 + delta2 + ... + > deltaN)/(tdelta_avg). > I will work out a fix. Thanks to point out. > After considering carefully, the original method should be OK. AMD nomenclature for CMT systems: [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0 -> [Compute Unit Core 1] -> Linux CPU 1 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2 -> [Compute Unit Core 1] -> Linux CPU 3 The deltaN is power per compute unit. Current one package has two CUs. In the *same* interval, CU0's power is 10W, CU1's power is 15W. The package (CU0 + CU1) power should be 15W, right? Because the interval is the same. Q = Q1 + Q2. P = Q/t = (Q1 + Q2)/t = Q1/t + Q2/t = P1 + P2. Is that clear? Thanks, Rui
Re: [PATCH] hwmon: (fam15h_power) Disable preemption when reading registers
On Sat, Jun 04, 2016 at 07:55:08AM -0700, Guenter Roeck wrote: > On 06/03/2016 11:29 AM, Borislav Petkov wrote: > >On Fri, Jun 03, 2016 at 11:18:36AM -0700, Guenter Roeck wrote: > >>I like this version. Applied, though it would be nice to get a > >>Tested-by: or Acked-by: from someone at AMD. I'll hold back until > >>early next week before sending it to Linus. > > > >If it makes you feel any better: I have the affected hardware and am > >testing on it too. I wouldnt've caught it otherwise. > > > > I'll send it to Linus next week unless I get an objection. > Guenter, I already verified on AMD CZ(family 15h, model 60h) and model 70h processors. Acked-and-tested-by: Huang Rui <ray.hu...@amd.com>
Re: [PATCH] hwmon: (fam15h_power) Disable preemption when reading registers
On Sat, Jun 04, 2016 at 07:55:08AM -0700, Guenter Roeck wrote: > On 06/03/2016 11:29 AM, Borislav Petkov wrote: > >On Fri, Jun 03, 2016 at 11:18:36AM -0700, Guenter Roeck wrote: > >>I like this version. Applied, though it would be nice to get a > >>Tested-by: or Acked-by: from someone at AMD. I'll hold back until > >>early next week before sending it to Linus. > > > >If it makes you feel any better: I have the affected hardware and am > >testing on it too. I wouldnt've caught it otherwise. > > > > I'll send it to Linus next week unless I get an objection. > Guenter, I already verified on AMD CZ(family 15h, model 60h) and model 70h processors. Acked-and-tested-by: Huang Rui
Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
On Wed, Apr 06, 2016 at 08:30:25AM -0700, Guenter Roeck wrote: > On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote: > > > > +static void do_read_registers_on_cu(void *_data) > > +{ > > + struct fam15h_power_data *data = _data; > > + int cpu, cu; > > + > > + cpu = smp_processor_id(); > > + > > Is this function now defined in non-SMP code ? If so, can you point me to the > patch or branch introducing it ? It doesn't seem to be in mainline or in -next > unless I am missing it. > In include/linux/smp.h #else /* !SMP */ static inline void smp_send_stop(void) { } /* * These macros fold the SMP functionality into a single CPU system */ #define raw_smp_processor_id() 0 ... /* * smp_processor_id(): get the current CPU ID. * * if DEBUG_PREEMPT is enabled then we check whether it is * used in a preemption-safe way. (smp_processor_id() is safe * if it's used in a preemption-off critical section, or in * a thread that is bound to the current CPU.) * * NOTE: raw_smp_processor_id() is for internal use only * (smp_processor_id() is the preferred variant), but in rare * instances it might also be used to turn off false positives * (i.e. smp_processor_id() use that the debugging code reports but * which use for some reason is legal). Don't use this to hack around * the warning message, as your code might not work under PREEMPT. */ #ifdef CONFIG_DEBUG_PREEMPT extern unsigned int debug_smp_processor_id(void); # define smp_processor_id() debug_smp_processor_id() #else # define smp_processor_id() raw_smp_processor_id() #endif Actually smp_processor_id() should returns 0 if we disable CONFIG_SMP. > > + /* > > +* With the new x86 topology modelling, cpu core id actually > > +* is compute unit id. > > +*/ > > + cu = cpu_data(cpu).cpu_core_id; > > + > > + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); > > +} > > + > > +/* > > + * This function is only able to be called when CPUID > > + * Fn8000_0007:EDX[12] is set. > > + */ > > +static int read_registers(struct fam15h_power_data *data) > > +{ > > + int this_cpu, ret, cpu; > > + int core, this_core; > > + cpumask_var_t mask; > > + > > + ret = zalloc_cpumask_var(, GFP_KERNEL); > > + if (!ret) > > + return -ENOMEM; > > + > > + get_online_cpus(); > > + this_cpu = smp_processor_id(); > > + > > + /* > > +* Choose the first online core of each compute unit, and then > > +* read their MSR value of power and ptsc in a single IPI, > > +* because the MSR value of CPU core represent the compute > > +* unit's. > > +*/ > > + core = -1; > > + > > + for_each_online_cpu(cpu) { > > + this_core = topology_core_id(cpu); > > + > > + if (this_core == core) > > + continue; > > + > > + core = this_core; > > + > Sorry if I missed some context - is it guaranteed that all cores in the same > compute unit are returned next to each other from for_each_online_cpu() ? > Yes, there is a documentation which introduced from v4.6-rc2: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f7be8610bca88e59dd2fd5d98fcbc5031ef0e079 - topology_core_id(); The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo "core_id." ... AMD nomenclature for CMT systems: [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0 -> [Compute Unit Core 1] -> Linux CPU 1 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2 -> [Compute Unit Core 1] -> Linux CPU 3 ray@hr-ub:~/tip$ cat /proc/cpuinfo | grep "core id" core id : 0 core id : 0 core id : 1 core id : 1 "this_core" here actually means the [Compute Unit] id which current [Compute Unit Core] belongs to. And "cpu" here means the [Compute Unit Core]. Thanks, Rui
Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
On Wed, Apr 06, 2016 at 08:30:25AM -0700, Guenter Roeck wrote: > On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote: > > > > +static void do_read_registers_on_cu(void *_data) > > +{ > > + struct fam15h_power_data *data = _data; > > + int cpu, cu; > > + > > + cpu = smp_processor_id(); > > + > > Is this function now defined in non-SMP code ? If so, can you point me to the > patch or branch introducing it ? It doesn't seem to be in mainline or in -next > unless I am missing it. > In include/linux/smp.h #else /* !SMP */ static inline void smp_send_stop(void) { } /* * These macros fold the SMP functionality into a single CPU system */ #define raw_smp_processor_id() 0 ... /* * smp_processor_id(): get the current CPU ID. * * if DEBUG_PREEMPT is enabled then we check whether it is * used in a preemption-safe way. (smp_processor_id() is safe * if it's used in a preemption-off critical section, or in * a thread that is bound to the current CPU.) * * NOTE: raw_smp_processor_id() is for internal use only * (smp_processor_id() is the preferred variant), but in rare * instances it might also be used to turn off false positives * (i.e. smp_processor_id() use that the debugging code reports but * which use for some reason is legal). Don't use this to hack around * the warning message, as your code might not work under PREEMPT. */ #ifdef CONFIG_DEBUG_PREEMPT extern unsigned int debug_smp_processor_id(void); # define smp_processor_id() debug_smp_processor_id() #else # define smp_processor_id() raw_smp_processor_id() #endif Actually smp_processor_id() should returns 0 if we disable CONFIG_SMP. > > + /* > > +* With the new x86 topology modelling, cpu core id actually > > +* is compute unit id. > > +*/ > > + cu = cpu_data(cpu).cpu_core_id; > > + > > + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); > > +} > > + > > +/* > > + * This function is only able to be called when CPUID > > + * Fn8000_0007:EDX[12] is set. > > + */ > > +static int read_registers(struct fam15h_power_data *data) > > +{ > > + int this_cpu, ret, cpu; > > + int core, this_core; > > + cpumask_var_t mask; > > + > > + ret = zalloc_cpumask_var(, GFP_KERNEL); > > + if (!ret) > > + return -ENOMEM; > > + > > + get_online_cpus(); > > + this_cpu = smp_processor_id(); > > + > > + /* > > +* Choose the first online core of each compute unit, and then > > +* read their MSR value of power and ptsc in a single IPI, > > +* because the MSR value of CPU core represent the compute > > +* unit's. > > +*/ > > + core = -1; > > + > > + for_each_online_cpu(cpu) { > > + this_core = topology_core_id(cpu); > > + > > + if (this_core == core) > > + continue; > > + > > + core = this_core; > > + > Sorry if I missed some context - is it guaranteed that all cores in the same > compute unit are returned next to each other from for_each_online_cpu() ? > Yes, there is a documentation which introduced from v4.6-rc2: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f7be8610bca88e59dd2fd5d98fcbc5031ef0e079 - topology_core_id(); The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo "core_id." ... AMD nomenclature for CMT systems: [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0 -> [Compute Unit Core 1] -> Linux CPU 1 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2 -> [Compute Unit Core 1] -> Linux CPU 3 ray@hr-ub:~/tip$ cat /proc/cpuinfo | grep "core id" core id : 0 core id : 0 core id : 1 core id : 1 "this_core" here actually means the [Compute Unit] id which current [Compute Unit Core] belongs to. And "cpu" here means the [Compute Unit Core]. Thanks, Rui
[PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power
PTSC is the performance timestamp counter value in a cpu core and the cores in one compute unit have the fixed frequency. So it picks up the performance timestamp counter value of the first core per compute unit to measure the interval for average power per compute unit. Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> --- drivers/hwmon/fam15h_power.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 4edbaf0..336d422 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -50,6 +50,7 @@ MODULE_LICENSE("GPL"); #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b +#define MSR_F15H_PTSC 0xc0010280 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -65,6 +66,8 @@ struct fam15h_power_data { u64 max_cu_acc_power; /* accumulated power of the compute units */ u64 cu_acc_power[MAX_CUS]; + /* performance timestamp counter */ + u64 cpu_sw_pwr_ptsc[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data) cu = cpu_data(cpu).cpu_core_id; rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); + rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); } /* -- 1.9.1
[PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
This patch adds a member in fam15h_power_data which specifies the compute unit accumulated power. It adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). This behavior can decrease IPI numbers. Suggested-by: Borislav Petkov <b...@alien8.de> Signed-off-by: Huang Rui <ray.hu...@amd.com> --- drivers/hwmon/fam15h_power.c | 72 +++- 1 file changed, 71 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 4f695d8..4edbaf0 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -25,6 +25,8 @@ #include #include #include +#include +#include #include #include @@ -44,7 +46,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_MIN_NUM_ATTRS 2 #define FAM15H_NUM_GROUPS 2 +#define MAX_CUS8 +#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -59,6 +63,8 @@ struct fam15h_power_data { struct attribute_group group; /* maximum accumulated power of a compute unit */ u64 max_cu_acc_power; + /* accumulated power of the compute units */ + u64 cu_acc_power[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -125,6 +131,70 @@ static ssize_t show_power_crit(struct device *dev, } static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); +static void do_read_registers_on_cu(void *_data) +{ + struct fam15h_power_data *data = _data; + int cpu, cu; + + cpu = smp_processor_id(); + + /* +* With the new x86 topology modelling, cpu core id actually +* is compute unit id. +*/ + cu = cpu_data(cpu).cpu_core_id; + + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); +} + +/* + * This function is only able to be called when CPUID + * Fn8000_0007:EDX[12] is set. + */ +static int read_registers(struct fam15h_power_data *data) +{ + int this_cpu, ret, cpu; + int core, this_core; + cpumask_var_t mask; + + ret = zalloc_cpumask_var(, GFP_KERNEL); + if (!ret) + return -ENOMEM; + + get_online_cpus(); + this_cpu = smp_processor_id(); + + /* +* Choose the first online core of each compute unit, and then +* read their MSR value of power and ptsc in a single IPI, +* because the MSR value of CPU core represent the compute +* unit's. +*/ + core = -1; + + for_each_online_cpu(cpu) { + this_core = topology_core_id(cpu); + + if (this_core == core) + continue; + + core = this_core; + + /* get any CPU on this compute unit */ + cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), mask); + } + + if (cpumask_test_cpu(this_cpu, mask)) + do_read_registers_on_cu(data); + + smp_call_function_many(mask, do_read_registers_on_cu, data, true); + put_online_cpus(); + + free_cpumask_var(mask); + + return 0; +} + static int fam15h_power_init_attrs(struct pci_dev *pdev, struct fam15h_power_data *data) { @@ -263,7 +333,7 @@ static int fam15h_power_init_data(struct pci_dev *f4, data->max_cu_acc_power = tmp; - return 0; + return read_registers(data); } static int fam15h_power_probe(struct pci_dev *pdev, -- 1.9.1
[PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power
PTSC is the performance timestamp counter value in a cpu core and the cores in one compute unit have the fixed frequency. So it picks up the performance timestamp counter value of the first core per compute unit to measure the interval for average power per compute unit. Signed-off-by: Huang Rui Cc: Borislav Petkov --- drivers/hwmon/fam15h_power.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 4edbaf0..336d422 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -50,6 +50,7 @@ MODULE_LICENSE("GPL"); #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b +#define MSR_F15H_PTSC 0xc0010280 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -65,6 +66,8 @@ struct fam15h_power_data { u64 max_cu_acc_power; /* accumulated power of the compute units */ u64 cu_acc_power[MAX_CUS]; + /* performance timestamp counter */ + u64 cpu_sw_pwr_ptsc[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data) cu = cpu_data(cpu).cpu_core_id; rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); + rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); } /* -- 1.9.1
[PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
This patch adds a member in fam15h_power_data which specifies the compute unit accumulated power. It adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). This behavior can decrease IPI numbers. Suggested-by: Borislav Petkov Signed-off-by: Huang Rui --- drivers/hwmon/fam15h_power.c | 72 +++- 1 file changed, 71 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 4f695d8..4edbaf0 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -25,6 +25,8 @@ #include #include #include +#include +#include #include #include @@ -44,7 +46,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_MIN_NUM_ATTRS 2 #define FAM15H_NUM_GROUPS 2 +#define MAX_CUS8 +#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -59,6 +63,8 @@ struct fam15h_power_data { struct attribute_group group; /* maximum accumulated power of a compute unit */ u64 max_cu_acc_power; + /* accumulated power of the compute units */ + u64 cu_acc_power[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -125,6 +131,70 @@ static ssize_t show_power_crit(struct device *dev, } static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); +static void do_read_registers_on_cu(void *_data) +{ + struct fam15h_power_data *data = _data; + int cpu, cu; + + cpu = smp_processor_id(); + + /* +* With the new x86 topology modelling, cpu core id actually +* is compute unit id. +*/ + cu = cpu_data(cpu).cpu_core_id; + + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); +} + +/* + * This function is only able to be called when CPUID + * Fn8000_0007:EDX[12] is set. + */ +static int read_registers(struct fam15h_power_data *data) +{ + int this_cpu, ret, cpu; + int core, this_core; + cpumask_var_t mask; + + ret = zalloc_cpumask_var(, GFP_KERNEL); + if (!ret) + return -ENOMEM; + + get_online_cpus(); + this_cpu = smp_processor_id(); + + /* +* Choose the first online core of each compute unit, and then +* read their MSR value of power and ptsc in a single IPI, +* because the MSR value of CPU core represent the compute +* unit's. +*/ + core = -1; + + for_each_online_cpu(cpu) { + this_core = topology_core_id(cpu); + + if (this_core == core) + continue; + + core = this_core; + + /* get any CPU on this compute unit */ + cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), mask); + } + + if (cpumask_test_cpu(this_cpu, mask)) + do_read_registers_on_cu(data); + + smp_call_function_many(mask, do_read_registers_on_cu, data, true); + put_online_cpus(); + + free_cpumask_var(mask); + + return 0; +} + static int fam15h_power_init_attrs(struct pci_dev *pdev, struct fam15h_power_data *data) { @@ -263,7 +333,7 @@ static int fam15h_power_init_data(struct pci_dev *f4, data->max_cu_acc_power = tmp; - return 0; + return read_registers(data); } static int fam15h_power_probe(struct pci_dev *pdev, -- 1.9.1
[PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm
This patch adds the description to explain the TDP reporting mechanism and accumulated power algorithm. Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> --- Documentation/hwmon/fam15h_power | 65 +++- drivers/hwmon/fam15h_power.c | 2 +- 2 files changed, 65 insertions(+), 2 deletions(-) diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power index e2b1b69..fb594c2 100644 --- a/Documentation/hwmon/fam15h_power +++ b/Documentation/hwmon/fam15h_power @@ -10,14 +10,22 @@ Supported chips: Datasheets: BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors + AMD64 Architecture Programmer's Manual Volume 2: System Programming Author: Andreas Herrmann <herrmann.der.u...@googlemail.com> Description --- +1) Processor TDP (Thermal design power) + +Given a fixed frequency and voltage, the power consumption of a +processor varies based on the workload being executed. Derated power +is the power consumed when running a specific application. Thermal +design power (TDP) is an example of derated power. + This driver permits reading of registers providing power information -of AMD Family 15h and 16h processors. +of AMD Family 15h and 16h processors via TDP algorithm. For AMD Family 15h and 16h processors the following power values can be calculated using different processor northbridge function @@ -37,3 +45,58 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts: On multi-node processors the calculated value is for the entire package and not for a single node. Thus the driver creates sysfs attributes only for internal node0 of a multi-node processor. + +2) Accumulated Power Mechanism + +This driver also introduces an algorithm that should be used to +calculate the average power consumed by a processor during a +measurement interval Tm. The feature of accumulated power mechanism is +indicated by CPUID Fn8000_0007_EDX[12]. + +* Tsample: compute unit power accumulator sample period +* Tref: the PTSC counter period +* PTSC: performance timestamp counter +* N: the ratio of compute unit power accumulator sample period to the + PTSC period +* Jmax: max compute unit accumulated power which is indicated by + MaxCpuSwPwrAcc MSR C001007b +* Jx/Jy: compute unit accumulated power which is indicated by + CpuSwPwrAcc MSR C001007a +* Tx/Ty: the value of performance timestamp counter which is indicated + by CU_PTSC MSR C0010280 +* PwrCPUave: CPU average power + +i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007. + N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]]. + +ii. Read the full range of the cumulative energy value from the new +MSR MaxCpuSwPwrAcc. + Jmax = value returned. +iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jx = value read from CpuSwPwrAcc and Tx = value read from +PTSC. + +iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jy = value read from CpuSwPwrAcc and Ty = value read from +PTSC. + +v. Calculate the average power consumption for a compute unit over +time period (y-x). Unit of result is uWatt. + if (Jy < Jx) // Rollover has occurred + Jdelta = (Jy + Jmax) - Jx + else + Jdelta = Jy - Jx + PwrCPUave = N * Jdelta * 1000 / (Ty - Tx) + +This driver provides PwrCPUave and interval(default is 10 millisecond +and maximum is 1 second): +* power1_average (PwrCPUave) +* power1_average_interval (Interval) + +The power1_average_interval can be updated at /etc/sensors3.conf file +as below: + +chip "fam15h_power-*" + set power1_average_interval 0.01 + +Then save it with "sensors -s". diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 5abbfa8..7d9d697 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -1,7 +1,7 @@ /* * fam15h_power.c - AMD Family 15h processor power monitoring * - * Copyright (c) 2011 Advanced Micro Devices, Inc. + * Copyright (c) 2011-2016 Advanced Micro Devices, Inc. * Author: Andreas Herrmann <herrmann.der.u...@googlemail.com> * * -- 1.9.1
[PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
This patch introduces an algorithm that computes the average power by reading a delta value of “core power accumulator” register during measurement interval, and then dividing delta value by the length of the time interval. User is able to use power1_average entry to measure the processor power consumption and power1_average_interval entry to set the interval. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... The result is current average processor power consumption in 10 millisecond. The unit of the result is uWatt. Suggested-by: Guenter Roeck <li...@roeck-us.net> Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> --- drivers/hwmon/fam15h_power.c | 128 +-- 1 file changed, 124 insertions(+), 4 deletions(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 336d422..5abbfa8 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include #include #include @@ -48,6 +50,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_NUM_GROUPS 2 #define MAX_CUS8 +/* set maximum interval as 1 second */ +#define MAX_INTERVAL 1000 + #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define MSR_F15H_PTSC 0xc0010280 @@ -68,6 +73,9 @@ struct fam15h_power_data { u64 cu_acc_power[MAX_CUS]; /* performance timestamp counter */ u64 cpu_sw_pwr_ptsc[MAX_CUS]; + /* online/offline status of current compute unit */ + int cu_on[MAX_CUS]; + unsigned long power_period; }; static ssize_t show_power(struct device *dev, @@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data) rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); + + data->cu_on[cu] = 1; } /* @@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data) if (!ret) return -ENOMEM; + memset(data->cu_on, 0, sizeof(int) * MAX_CUS); + get_online_cpus(); this_cpu = smp_processor_id(); @@ -199,6 +211,98 @@ static int read_registers(struct fam15h_power_data *data) return 0; } +static ssize_t acc_show_power(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], + jdelta[MAX_CUS]; + u64 tdelta, avg_acc; + int cu, cu_num, ret; + signed long leftover; + + /* +* With the new x86 topology modelling, x86_max_cores is the +* compute unit number. +*/ + cu_num = boot_cpu_data.x86_max_cores; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0; cu < cu_num; cu++) { + prev_cu_acc_power[cu] = data->cu_acc_power[cu]; + prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu]; + } + + leftover = schedule_timeout_interruptible(msecs_to_jiffies(data->power_period)); + if (leftover) + return 0; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0, avg_acc = 0; cu < cu_num; cu++) { + /* check if current compute unit is online */ + if (data->cu_on[cu] == 0) + continue; + + if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) { + jdelta[cu] = data->max_cu_acc_power + data->cu_acc_power[cu]; + jdelta[cu] -= prev_cu_acc_power[cu]; + } else { + jdelta[cu] = data->cu_acc_power[cu] - prev_cu_acc_power[cu]; + } + tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu]; + jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000; + do_div(jdelta[cu], tdelta); + + /* the unit is microWatt */ + avg_acc += jdelta[cu]; + } + + return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); +} +static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); + +static ssize_t acc_show_power_period(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + + return sprintf(buf, "%lu\n", data->power_period); +} + +static ssize_t acc_
[PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function
This patch adds a platform check function to make code more readable. Signed-off-by: Huang Rui <ray.hu...@amd.com> --- drivers/hwmon/fam15h_power.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 7d9d697..eb97a92 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -78,6 +78,11 @@ struct fam15h_power_data { unsigned long power_period; }; +static bool is_carrizo_or_later(void) +{ + return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; +} + static ssize_t show_power(struct device *dev, struct device_attribute *attr, char *buf) { @@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, TdpRunAvgAccCap bit field * is extended to 4:31 from 4:25. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) { + if (is_carrizo_or_later()) { running_avg_capture = val >> 4; running_avg_capture = sign_extend32(running_avg_capture, 27); } else { @@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, ApmTdpLimit bit field * is extended to 16:31 from 16:28. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) + if (is_carrizo_or_later()) tdp_limit = val >> 16; else tdp_limit = (val >> 16) & 0x1fff; -- 1.9.1
[PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm
This patch adds the description to explain the TDP reporting mechanism and accumulated power algorithm. Signed-off-by: Huang Rui Cc: Borislav Petkov --- Documentation/hwmon/fam15h_power | 65 +++- drivers/hwmon/fam15h_power.c | 2 +- 2 files changed, 65 insertions(+), 2 deletions(-) diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power index e2b1b69..fb594c2 100644 --- a/Documentation/hwmon/fam15h_power +++ b/Documentation/hwmon/fam15h_power @@ -10,14 +10,22 @@ Supported chips: Datasheets: BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors + AMD64 Architecture Programmer's Manual Volume 2: System Programming Author: Andreas Herrmann Description --- +1) Processor TDP (Thermal design power) + +Given a fixed frequency and voltage, the power consumption of a +processor varies based on the workload being executed. Derated power +is the power consumed when running a specific application. Thermal +design power (TDP) is an example of derated power. + This driver permits reading of registers providing power information -of AMD Family 15h and 16h processors. +of AMD Family 15h and 16h processors via TDP algorithm. For AMD Family 15h and 16h processors the following power values can be calculated using different processor northbridge function @@ -37,3 +45,58 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts: On multi-node processors the calculated value is for the entire package and not for a single node. Thus the driver creates sysfs attributes only for internal node0 of a multi-node processor. + +2) Accumulated Power Mechanism + +This driver also introduces an algorithm that should be used to +calculate the average power consumed by a processor during a +measurement interval Tm. The feature of accumulated power mechanism is +indicated by CPUID Fn8000_0007_EDX[12]. + +* Tsample: compute unit power accumulator sample period +* Tref: the PTSC counter period +* PTSC: performance timestamp counter +* N: the ratio of compute unit power accumulator sample period to the + PTSC period +* Jmax: max compute unit accumulated power which is indicated by + MaxCpuSwPwrAcc MSR C001007b +* Jx/Jy: compute unit accumulated power which is indicated by + CpuSwPwrAcc MSR C001007a +* Tx/Ty: the value of performance timestamp counter which is indicated + by CU_PTSC MSR C0010280 +* PwrCPUave: CPU average power + +i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007. + N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]]. + +ii. Read the full range of the cumulative energy value from the new +MSR MaxCpuSwPwrAcc. + Jmax = value returned. +iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jx = value read from CpuSwPwrAcc and Tx = value read from +PTSC. + +iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jy = value read from CpuSwPwrAcc and Ty = value read from +PTSC. + +v. Calculate the average power consumption for a compute unit over +time period (y-x). Unit of result is uWatt. + if (Jy < Jx) // Rollover has occurred + Jdelta = (Jy + Jmax) - Jx + else + Jdelta = Jy - Jx + PwrCPUave = N * Jdelta * 1000 / (Ty - Tx) + +This driver provides PwrCPUave and interval(default is 10 millisecond +and maximum is 1 second): +* power1_average (PwrCPUave) +* power1_average_interval (Interval) + +The power1_average_interval can be updated at /etc/sensors3.conf file +as below: + +chip "fam15h_power-*" + set power1_average_interval 0.01 + +Then save it with "sensors -s". diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 5abbfa8..7d9d697 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -1,7 +1,7 @@ /* * fam15h_power.c - AMD Family 15h processor power monitoring * - * Copyright (c) 2011 Advanced Micro Devices, Inc. + * Copyright (c) 2011-2016 Advanced Micro Devices, Inc. * Author: Andreas Herrmann * * -- 1.9.1
[PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
This patch introduces an algorithm that computes the average power by reading a delta value of “core power accumulator” register during measurement interval, and then dividing delta value by the length of the time interval. User is able to use power1_average entry to measure the processor power consumption and power1_average_interval entry to set the interval. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... The result is current average processor power consumption in 10 millisecond. The unit of the result is uWatt. Suggested-by: Guenter Roeck Signed-off-by: Huang Rui Cc: Borislav Petkov --- drivers/hwmon/fam15h_power.c | 128 +-- 1 file changed, 124 insertions(+), 4 deletions(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 336d422..5abbfa8 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include #include #include @@ -48,6 +50,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_NUM_GROUPS 2 #define MAX_CUS8 +/* set maximum interval as 1 second */ +#define MAX_INTERVAL 1000 + #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define MSR_F15H_PTSC 0xc0010280 @@ -68,6 +73,9 @@ struct fam15h_power_data { u64 cu_acc_power[MAX_CUS]; /* performance timestamp counter */ u64 cpu_sw_pwr_ptsc[MAX_CUS]; + /* online/offline status of current compute unit */ + int cu_on[MAX_CUS]; + unsigned long power_period; }; static ssize_t show_power(struct device *dev, @@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data) rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); + + data->cu_on[cu] = 1; } /* @@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data) if (!ret) return -ENOMEM; + memset(data->cu_on, 0, sizeof(int) * MAX_CUS); + get_online_cpus(); this_cpu = smp_processor_id(); @@ -199,6 +211,98 @@ static int read_registers(struct fam15h_power_data *data) return 0; } +static ssize_t acc_show_power(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], + jdelta[MAX_CUS]; + u64 tdelta, avg_acc; + int cu, cu_num, ret; + signed long leftover; + + /* +* With the new x86 topology modelling, x86_max_cores is the +* compute unit number. +*/ + cu_num = boot_cpu_data.x86_max_cores; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0; cu < cu_num; cu++) { + prev_cu_acc_power[cu] = data->cu_acc_power[cu]; + prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu]; + } + + leftover = schedule_timeout_interruptible(msecs_to_jiffies(data->power_period)); + if (leftover) + return 0; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0, avg_acc = 0; cu < cu_num; cu++) { + /* check if current compute unit is online */ + if (data->cu_on[cu] == 0) + continue; + + if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) { + jdelta[cu] = data->max_cu_acc_power + data->cu_acc_power[cu]; + jdelta[cu] -= prev_cu_acc_power[cu]; + } else { + jdelta[cu] = data->cu_acc_power[cu] - prev_cu_acc_power[cu]; + } + tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu]; + jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000; + do_div(jdelta[cu], tdelta); + + /* the unit is microWatt */ + avg_acc += jdelta[cu]; + } + + return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); +} +static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); + +static ssize_t acc_show_power_period(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + + return sprintf(buf, "%lu\n", data->power_period); +} + +static ssize_t acc_set_power_period(struct device *dev, + struct device_attribute *attr,
[PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function
This patch adds a platform check function to make code more readable. Signed-off-by: Huang Rui --- drivers/hwmon/fam15h_power.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 7d9d697..eb97a92 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -78,6 +78,11 @@ struct fam15h_power_data { unsigned long power_period; }; +static bool is_carrizo_or_later(void) +{ + return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; +} + static ssize_t show_power(struct device *dev, struct device_attribute *attr, char *buf) { @@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, TdpRunAvgAccCap bit field * is extended to 4:31 from 4:25. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) { + if (is_carrizo_or_later()) { running_avg_capture = val >> 4; running_avg_capture = sign_extend32(running_avg_capture, 27); } else { @@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, ApmTdpLimit bit field * is extended to 16:31 from 16:28. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) + if (is_carrizo_or_later()) tdp_limit = val >> 16; else tdp_limit = (val >> 16) & 0x1fff; -- 1.9.1
[PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm
Hi Guenter, This serial of patches introduces an accumulated power reporting algorithm. It will calculate the average power consumption for the processor. The cpu feature flag is CPUID.8000_0007H:EDX[12]. This algorithm is used to test the comparison of processor power consumption with between MWAITX delay and TSC delay on AMD Carrizo platforms. Reference: https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support") Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer") V1: https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com V2: https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com V5: https://lkml.kernel.org/r/1459143136-2412-1-git-send-email-ray.hu...@amd.com On V5, I used the new x86 topology with compute unit id (cpu_core_id) and compute unit numbers (x86_max_cores) for AMD processors. The change and documentation will go to master soon as below link. And This way is better to use smp_num_siblings which is un-defined variable out of CONFIG_SMP. On V6, Guenter, I suggest that you merge the v4.6-rc2 into your hwmon-next branch. Because there are two commits (ee6825, 8196da) which are the dependences of this patch set to fix the AMD compute unit and max cores things. And you can see the new topology info at Documentation/x86/topology.txt. Changes from v1 -> v2: - Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to avoid overwrite on multi-CPU system. - Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code. - Remove unnecessary warning print. - Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). - Use power1_average and power1_average_interval standard entry instread of power1_acc - Fix the CPU-hotplug case. Changes from v2 -> v3: - As Guenter's suggestion, remove typecast, use >groups[0]. - Remove all "fam15_power_*" prefix at data. - Remove unnecessary ( ). - Fix the issue that is reported by build test robot, and add CPU_SUP_AMD as the dependence of fam15h_power - Remove the WARN_ON at do_read_registers_on_cu, because it must be behind CPUID check. The MSR must be available since CPUID.8000_0007H:EDX[12] is set - Add get_online_cpus()/put_online_cpus() functions. - Refine commments and the method which generate cpumask for cu. - Add the interval scope to make the value suitable for user experience - Remove the useless mutex. Changes from v3 -> v4: - Rebase the patches to latest groeck/hwmon-next. - Use smp_num_siblings instead of cores_per_cu accessor. - Refine the cpumask method which is inspired by perf solution. - Fix some typo and errors. Changes from v4 -> v5: - Rebase patches to v4.6-rc1. - Use new x86 topology with compute id (cpu_core_id) and compute unit number (x86_max_cores) instead of smp_num_siblings. Changes from v5 -> v6: - Remove unuse get_cpu/put_cpu. - Update cpumask handling method with new topology. - Use X86_FEATURE_ACC_POWER flag instead cpuid - Add comments to explain default interval value and add documentation to explain how to modify the interval. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... These patches are rebased on v4.6-rc2. Thanks, Rui Huang Rui (6): hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence hwmon: (fam15h_power) Add compute unit accumulated power hwmon: (fam15h_power) Add ptsc counter value for accumulated power hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm hwmon: (fam15h_power) Add platform check function Documentation/hwmon/fam15h_power | 65 +++- drivers/hwmon/Kconfig| 2 +- drivers/hwmon/fam15h_power.c | 215 +-- 3 files changed, 272 insertions(+), 10 deletions(-) -- 1.9.1
[PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm
Hi Guenter, This serial of patches introduces an accumulated power reporting algorithm. It will calculate the average power consumption for the processor. The cpu feature flag is CPUID.8000_0007H:EDX[12]. This algorithm is used to test the comparison of processor power consumption with between MWAITX delay and TSC delay on AMD Carrizo platforms. Reference: https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support") Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer") V1: https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com V2: https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com V5: https://lkml.kernel.org/r/1459143136-2412-1-git-send-email-ray.hu...@amd.com On V5, I used the new x86 topology with compute unit id (cpu_core_id) and compute unit numbers (x86_max_cores) for AMD processors. The change and documentation will go to master soon as below link. And This way is better to use smp_num_siblings which is un-defined variable out of CONFIG_SMP. On V6, Guenter, I suggest that you merge the v4.6-rc2 into your hwmon-next branch. Because there are two commits (ee6825, 8196da) which are the dependences of this patch set to fix the AMD compute unit and max cores things. And you can see the new topology info at Documentation/x86/topology.txt. Changes from v1 -> v2: - Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to avoid overwrite on multi-CPU system. - Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code. - Remove unnecessary warning print. - Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). - Use power1_average and power1_average_interval standard entry instread of power1_acc - Fix the CPU-hotplug case. Changes from v2 -> v3: - As Guenter's suggestion, remove typecast, use >groups[0]. - Remove all "fam15_power_*" prefix at data. - Remove unnecessary ( ). - Fix the issue that is reported by build test robot, and add CPU_SUP_AMD as the dependence of fam15h_power - Remove the WARN_ON at do_read_registers_on_cu, because it must be behind CPUID check. The MSR must be available since CPUID.8000_0007H:EDX[12] is set - Add get_online_cpus()/put_online_cpus() functions. - Refine commments and the method which generate cpumask for cu. - Add the interval scope to make the value suitable for user experience - Remove the useless mutex. Changes from v3 -> v4: - Rebase the patches to latest groeck/hwmon-next. - Use smp_num_siblings instead of cores_per_cu accessor. - Refine the cpumask method which is inspired by perf solution. - Fix some typo and errors. Changes from v4 -> v5: - Rebase patches to v4.6-rc1. - Use new x86 topology with compute id (cpu_core_id) and compute unit number (x86_max_cores) instead of smp_num_siblings. Changes from v5 -> v6: - Remove unuse get_cpu/put_cpu. - Update cpumask handling method with new topology. - Use X86_FEATURE_ACC_POWER flag instead cpuid - Add comments to explain default interval value and add documentation to explain how to modify the interval. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... These patches are rebased on v4.6-rc2. Thanks, Rui Huang Rui (6): hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence hwmon: (fam15h_power) Add compute unit accumulated power hwmon: (fam15h_power) Add ptsc counter value for accumulated power hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm hwmon: (fam15h_power) Add platform check function Documentation/hwmon/fam15h_power | 65 +++- drivers/hwmon/Kconfig| 2 +- drivers/hwmon/fam15h_power.c | 215 +-- 3 files changed, 272 insertions(+), 10 deletions(-) -- 1.9.1
[PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power driver. Because the following patch will use the interface from x86/kernel/cpu/amd.c. Otherwise, the below error might be encountered: All errors (new ones prefixed by >>): drivers/built-in.o: In function `fam15h_power_probe': >> fam15h_power.c:(.text+0x26e3a3): undefined reference to >> `amd_get_cores_per_cu' fam15h_power.c:(.text+0x26e41e): undefined reference to `amd_get_cores_per_cu' Reported-by: build test robot <l...@intel.com> Acked-by: Borislav Petkov <b...@suse.de> Signed-off-by: Huang Rui <ray.hu...@amd.com> --- drivers/hwmon/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig index 5c2d13a..4be3792 100644 --- a/drivers/hwmon/Kconfig +++ b/drivers/hwmon/Kconfig @@ -288,7 +288,7 @@ config SENSORS_K10TEMP config SENSORS_FAM15H_POWER tristate "AMD Family 15h processor power" - depends on X86 && PCI + depends on X86 && PCI && CPU_SUP_AMD help If you say yes here you get support for processor power information of your AMD family 15h CPU. -- 1.9.1
[PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power driver. Because the following patch will use the interface from x86/kernel/cpu/amd.c. Otherwise, the below error might be encountered: All errors (new ones prefixed by >>): drivers/built-in.o: In function `fam15h_power_probe': >> fam15h_power.c:(.text+0x26e3a3): undefined reference to >> `amd_get_cores_per_cu' fam15h_power.c:(.text+0x26e41e): undefined reference to `amd_get_cores_per_cu' Reported-by: build test robot Acked-by: Borislav Petkov Signed-off-by: Huang Rui --- drivers/hwmon/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig index 5c2d13a..4be3792 100644 --- a/drivers/hwmon/Kconfig +++ b/drivers/hwmon/Kconfig @@ -288,7 +288,7 @@ config SENSORS_K10TEMP config SENSORS_FAM15H_POWER tristate "AMD Family 15h processor power" - depends on X86 && PCI + depends on X86 && PCI && CPU_SUP_AMD help If you say yes here you get support for processor power information of your AMD family 15h CPU. -- 1.9.1
[tip:perf/core] perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support
Commit-ID: 8a22426184774d7ced9c1d3aa4d95d34101fb3be Gitweb: http://git.kernel.org/tip/8a22426184774d7ced9c1d3aa4d95d34101fb3be Author: Huang Rui <ray.hu...@amd.com> AuthorDate: Fri, 29 Jan 2016 16:29:56 +0800 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Thu, 31 Mar 2016 10:30:39 +0200 perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100 MHz rate in all P-states, and C states, S0, or S1. The frequency is about 100MHz. This counter will be used to calculate processor power and other parts. So add an interface into the MSR PMU to get the PTSC counter value. Signed-off-by: Huang Rui <ray.hu...@amd.com> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org> Cc: Alexander Shishkin <alexander.shish...@linux.intel.com> Cc: Andy Lutomirski <l...@amacapital.net> Cc: Aravind Gopalakrishnan <aravind.gopalakrish...@amd.com> Cc: Arnaldo Carvalho de Melo <a...@kernel.org> Cc: Arnaldo Carvalho de Melo <a...@redhat.com> Cc: Borislav Petkov <b...@alien8.de> Cc: Borislav Petkov <b...@suse.de> Cc: Fengguang Wu <fengguang...@intel.com> Cc: Jacob Shin <jacob.w.s...@gmail.com> Cc: Jiri Olsa <jo...@redhat.com> Cc: Kan Liang <kan.li...@intel.com> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Robert Richter <r...@kernel.org> Cc: Stephane Eranian <eran...@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com> Cc: Thomas Gleixner <t...@linutronix.de> Cc: Vince Weaver <vincent.wea...@maine.edu> Link: http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.hu...@amd.com Signed-off-by: Ingo Molnar <mi...@kernel.org> --- arch/x86/events/msr.c | 8 arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 1 + 3 files changed, 10 insertions(+) diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c index ec863b9..6f6772f 100644 --- a/arch/x86/events/msr.c +++ b/arch/x86/events/msr.c @@ -6,6 +6,7 @@ enum perf_msr_id { PERF_MSR_MPERF = 2, PERF_MSR_PPERF = 3, PERF_MSR_SMI= 4, + PERF_MSR_PTSC = 5, PERF_MSR_EVENT_MAX, }; @@ -15,6 +16,11 @@ static bool test_aperfmperf(int idx) return boot_cpu_has(X86_FEATURE_APERFMPERF); } +static bool test_ptsc(int idx) +{ + return boot_cpu_has(X86_FEATURE_PTSC); +} + static bool test_intel(int idx) { if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || @@ -74,6 +80,7 @@ PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01"); PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02"); PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03"); PMU_EVENT_ATTR_STRING(smi, evattr_smi, "event=0x04"); +PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05"); static struct perf_msr msr[] = { [PERF_MSR_TSC] = { 0, _tsc,NULL, }, @@ -81,6 +88,7 @@ static struct perf_msr msr[] = { [PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf, test_aperfmperf, }, [PERF_MSR_PPERF] = { MSR_PPERF, _pperf, test_intel, }, [PERF_MSR_SMI] = { MSR_SMI_COUNT, _smi,test_intel, }, + [PERF_MSR_PTSC] = { MSR_F15H_PTSC,_ptsc, test_ptsc, }, }; static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = { diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 44ebd04..bdf9042 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -177,6 +177,7 @@ #define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */ #define X86_FEATURE_PERFCTR_NB ( 6*32+24) /* NB performance counter extensions */ #define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */ +#define X86_FEATURE_PTSC ( 6*32+27) /* performance time-stamp counter */ #define X86_FEATURE_PERFCTR_L2 ( 6*32+28) /* L2 performance counter extensions */ #define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 984ab75..6e6a5cc 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -326,6 +326,7 @@ #define MSR_F15H_PERF_CTR 0xc0010201 #define MSR_F15H_NB_PERF_CTL 0xc0010240 #define MSR_F15H_NB_PERF_CTR 0xc0010241 +#define MSR_F15H_PTSC 0xc0010280 #define MSR_F15H_IC_CFG0xc0011021 /* Fam 10h MSRs */
[tip:perf/core] perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter
Commit-ID: aaf248848db503927644d28e239bc399ed45959f Gitweb: http://git.kernel.org/tip/aaf248848db503927644d28e239bc399ed45959f Author: Huang Rui <ray.hu...@amd.com> AuthorDate: Fri, 29 Jan 2016 16:29:57 +0800 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Thu, 31 Mar 2016 10:30:39 +0200 perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter AMD Zeppelin (Family 17h, Model 00h) introduces an instructions retired performance counter which is indicated by CPUID.8000_0008H:EBX[1]. A dedicated Instructions Retired MSR register (MSR 0xC000_000E9) increments once for every instruction retired. Signed-off-by: Huang Rui <ray.hu...@amd.com> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org> Cc: Alexander Shishkin <alexander.shish...@linux.intel.com> Cc: Andy Lutomirski <l...@amacapital.net> Cc: Aravind Gopalakrishnan <aravind.gopalakrish...@amd.com> Cc: Arnaldo Carvalho de Melo <a...@kernel.org> Cc: Arnaldo Carvalho de Melo <a...@redhat.com> Cc: Borislav Petkov <b...@alien8.de> Cc: Borislav Petkov <b...@suse.de> Cc: Fengguang Wu <fengguang...@intel.com> Cc: Jacob Shin <jacob.w.s...@gmail.com> Cc: Jiri Olsa <jo...@redhat.com> Cc: Kan Liang <kan.li...@intel.com> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Robert Richter <r...@kernel.org> Cc: Stephane Eranian <eran...@google.com> Cc: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com> Cc: Thomas Gleixner <t...@linutronix.de> Cc: Vince Weaver <vincent.wea...@maine.edu> Link: http://lkml.kernel.org/r/1454056197-5893-3-git-send-email-ray.hu...@amd.com Signed-off-by: Ingo Molnar <mi...@kernel.org> --- arch/x86/events/msr.c | 30 +++--- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 3 +++ 3 files changed, 23 insertions(+), 11 deletions(-) diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c index 6f6772f..7111400 100644 --- a/arch/x86/events/msr.c +++ b/arch/x86/events/msr.c @@ -7,6 +7,7 @@ enum perf_msr_id { PERF_MSR_PPERF = 3, PERF_MSR_SMI= 4, PERF_MSR_PTSC = 5, + PERF_MSR_IRPERF = 6, PERF_MSR_EVENT_MAX, }; @@ -21,6 +22,11 @@ static bool test_ptsc(int idx) return boot_cpu_has(X86_FEATURE_PTSC); } +static bool test_irperf(int idx) +{ + return boot_cpu_has(X86_FEATURE_IRPERF); +} + static bool test_intel(int idx) { if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || @@ -75,20 +81,22 @@ struct perf_msr { bool(*test)(int idx); }; -PMU_EVENT_ATTR_STRING(tsc, evattr_tsc, "event=0x00"); -PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01"); -PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02"); -PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03"); -PMU_EVENT_ATTR_STRING(smi, evattr_smi, "event=0x04"); -PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05"); +PMU_EVENT_ATTR_STRING(tsc,evattr_tsc,"event=0x00"); +PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01"); +PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02"); +PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03"); +PMU_EVENT_ATTR_STRING(smi,evattr_smi,"event=0x04"); +PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05"); +PMU_EVENT_ATTR_STRING(irperf, evattr_irperf, "event=0x06"); static struct perf_msr msr[] = { - [PERF_MSR_TSC] = { 0, _tsc,NULL, }, - [PERF_MSR_APERF] = { MSR_IA32_APERF,_aperf, test_aperfmperf, }, - [PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf, test_aperfmperf, }, - [PERF_MSR_PPERF] = { MSR_PPERF, _pperf, test_intel, }, - [PERF_MSR_SMI] = { MSR_SMI_COUNT, _smi,test_intel, }, + [PERF_MSR_TSC]= { 0,_tsc,NULL, }, + [PERF_MSR_APERF] = { MSR_IA32_APERF, _aperf, test_aperfmperf, }, + [PERF_MSR_MPERF] = { MSR_IA32_MPERF, _mperf, test_aperfmperf, }, + [PERF_MSR_PPERF] = { MSR_PPERF,_pperf, test_intel, }, + [PERF_MSR_SMI]= { MSR_SMI_COUNT,_smi,test_intel, }, [PERF_MSR_PTSC] = { MSR_F15H_PTSC,_ptsc, test_ptsc, }, + [PERF_MSR_IRPERF] = { MSR_F17H_IRPERF, _irperf, test_irperf, }, }; static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = { diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index bdf9042..dd448a9 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -251,6 +251,7 @@ /* AMD-defined CPU features, CPUID
[tip:perf/core] perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support
Commit-ID: 8a22426184774d7ced9c1d3aa4d95d34101fb3be Gitweb: http://git.kernel.org/tip/8a22426184774d7ced9c1d3aa4d95d34101fb3be Author: Huang Rui AuthorDate: Fri, 29 Jan 2016 16:29:56 +0800 Committer: Ingo Molnar CommitDate: Thu, 31 Mar 2016 10:30:39 +0200 perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100 MHz rate in all P-states, and C states, S0, or S1. The frequency is about 100MHz. This counter will be used to calculate processor power and other parts. So add an interface into the MSR PMU to get the PTSC counter value. Signed-off-by: Huang Rui Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Andy Lutomirski Cc: Aravind Gopalakrishnan Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Borislav Petkov Cc: Fengguang Wu Cc: Jacob Shin Cc: Jiri Olsa Cc: Kan Liang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Robert Richter Cc: Stephane Eranian Cc: Suravee Suthikulpanit Cc: Thomas Gleixner Cc: Vince Weaver Link: http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.hu...@amd.com Signed-off-by: Ingo Molnar --- arch/x86/events/msr.c | 8 arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 1 + 3 files changed, 10 insertions(+) diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c index ec863b9..6f6772f 100644 --- a/arch/x86/events/msr.c +++ b/arch/x86/events/msr.c @@ -6,6 +6,7 @@ enum perf_msr_id { PERF_MSR_MPERF = 2, PERF_MSR_PPERF = 3, PERF_MSR_SMI= 4, + PERF_MSR_PTSC = 5, PERF_MSR_EVENT_MAX, }; @@ -15,6 +16,11 @@ static bool test_aperfmperf(int idx) return boot_cpu_has(X86_FEATURE_APERFMPERF); } +static bool test_ptsc(int idx) +{ + return boot_cpu_has(X86_FEATURE_PTSC); +} + static bool test_intel(int idx) { if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || @@ -74,6 +80,7 @@ PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01"); PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02"); PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03"); PMU_EVENT_ATTR_STRING(smi, evattr_smi, "event=0x04"); +PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05"); static struct perf_msr msr[] = { [PERF_MSR_TSC] = { 0, _tsc,NULL, }, @@ -81,6 +88,7 @@ static struct perf_msr msr[] = { [PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf, test_aperfmperf, }, [PERF_MSR_PPERF] = { MSR_PPERF, _pperf, test_intel, }, [PERF_MSR_SMI] = { MSR_SMI_COUNT, _smi,test_intel, }, + [PERF_MSR_PTSC] = { MSR_F15H_PTSC,_ptsc, test_ptsc, }, }; static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = { diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 44ebd04..bdf9042 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -177,6 +177,7 @@ #define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */ #define X86_FEATURE_PERFCTR_NB ( 6*32+24) /* NB performance counter extensions */ #define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */ +#define X86_FEATURE_PTSC ( 6*32+27) /* performance time-stamp counter */ #define X86_FEATURE_PERFCTR_L2 ( 6*32+28) /* L2 performance counter extensions */ #define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 984ab75..6e6a5cc 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -326,6 +326,7 @@ #define MSR_F15H_PERF_CTR 0xc0010201 #define MSR_F15H_NB_PERF_CTL 0xc0010240 #define MSR_F15H_NB_PERF_CTR 0xc0010241 +#define MSR_F15H_PTSC 0xc0010280 #define MSR_F15H_IC_CFG0xc0011021 /* Fam 10h MSRs */
[tip:perf/core] perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter
Commit-ID: aaf248848db503927644d28e239bc399ed45959f Gitweb: http://git.kernel.org/tip/aaf248848db503927644d28e239bc399ed45959f Author: Huang Rui AuthorDate: Fri, 29 Jan 2016 16:29:57 +0800 Committer: Ingo Molnar CommitDate: Thu, 31 Mar 2016 10:30:39 +0200 perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter AMD Zeppelin (Family 17h, Model 00h) introduces an instructions retired performance counter which is indicated by CPUID.8000_0008H:EBX[1]. A dedicated Instructions Retired MSR register (MSR 0xC000_000E9) increments once for every instruction retired. Signed-off-by: Huang Rui Signed-off-by: Peter Zijlstra (Intel) Cc: Alexander Shishkin Cc: Andy Lutomirski Cc: Aravind Gopalakrishnan Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Borislav Petkov Cc: Fengguang Wu Cc: Jacob Shin Cc: Jiri Olsa Cc: Kan Liang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Robert Richter Cc: Stephane Eranian Cc: Suravee Suthikulpanit Cc: Thomas Gleixner Cc: Vince Weaver Link: http://lkml.kernel.org/r/1454056197-5893-3-git-send-email-ray.hu...@amd.com Signed-off-by: Ingo Molnar --- arch/x86/events/msr.c | 30 +++--- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/msr-index.h | 3 +++ 3 files changed, 23 insertions(+), 11 deletions(-) diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c index 6f6772f..7111400 100644 --- a/arch/x86/events/msr.c +++ b/arch/x86/events/msr.c @@ -7,6 +7,7 @@ enum perf_msr_id { PERF_MSR_PPERF = 3, PERF_MSR_SMI= 4, PERF_MSR_PTSC = 5, + PERF_MSR_IRPERF = 6, PERF_MSR_EVENT_MAX, }; @@ -21,6 +22,11 @@ static bool test_ptsc(int idx) return boot_cpu_has(X86_FEATURE_PTSC); } +static bool test_irperf(int idx) +{ + return boot_cpu_has(X86_FEATURE_IRPERF); +} + static bool test_intel(int idx) { if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || @@ -75,20 +81,22 @@ struct perf_msr { bool(*test)(int idx); }; -PMU_EVENT_ATTR_STRING(tsc, evattr_tsc, "event=0x00"); -PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01"); -PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02"); -PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03"); -PMU_EVENT_ATTR_STRING(smi, evattr_smi, "event=0x04"); -PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05"); +PMU_EVENT_ATTR_STRING(tsc,evattr_tsc,"event=0x00"); +PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01"); +PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02"); +PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03"); +PMU_EVENT_ATTR_STRING(smi,evattr_smi,"event=0x04"); +PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05"); +PMU_EVENT_ATTR_STRING(irperf, evattr_irperf, "event=0x06"); static struct perf_msr msr[] = { - [PERF_MSR_TSC] = { 0, _tsc,NULL, }, - [PERF_MSR_APERF] = { MSR_IA32_APERF,_aperf, test_aperfmperf, }, - [PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf, test_aperfmperf, }, - [PERF_MSR_PPERF] = { MSR_PPERF, _pperf, test_intel, }, - [PERF_MSR_SMI] = { MSR_SMI_COUNT, _smi,test_intel, }, + [PERF_MSR_TSC]= { 0,_tsc,NULL, }, + [PERF_MSR_APERF] = { MSR_IA32_APERF, _aperf, test_aperfmperf, }, + [PERF_MSR_MPERF] = { MSR_IA32_MPERF, _mperf, test_aperfmperf, }, + [PERF_MSR_PPERF] = { MSR_PPERF,_pperf, test_intel, }, + [PERF_MSR_SMI]= { MSR_SMI_COUNT,_smi,test_intel, }, [PERF_MSR_PTSC] = { MSR_F15H_PTSC,_ptsc, test_ptsc, }, + [PERF_MSR_IRPERF] = { MSR_F17H_IRPERF, _irperf, test_irperf, }, }; static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = { diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index bdf9042..dd448a9 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -251,6 +251,7 @@ /* AMD-defined CPU features, CPUID level 0x8008 (ebx), word 13 */ #define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */ +#define X86_FEATURE_IRPERF (13*32+1) /* Instructions Retired Count */ /* Thermal and Power Management Leaf, CPUID level 0x0006 (eax), word 14 */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 6e6a5cc..e0e2f7d 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -313,6 +313,9 @@ #define MSR_AMD64_IBSOPDATA4 0xc001103d #define MSR_AMD64_IBS_REG_COUNT_MAX8 /* includes MSR_AM
[tip:x86/urgent] x86/cpu: Add advanced power management bits
Commit-ID: 34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da Gitweb: http://git.kernel.org/tip/34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da Author: Huang Rui <ray.hu...@amd.com> AuthorDate: Fri, 25 Mar 2016 10:08:40 +0800 Committer: Thomas Gleixner <t...@linutronix.de> CommitDate: Tue, 29 Mar 2016 11:12:11 +0200 x86/cpu: Add advanced power management bits Bit 11 of CPUID 8000_0007 edx is processor feedback interface. Bit 12 of CPUID 8000_0007 edx is accumulated power. Print proper names in proc/cpuinfo Reported-and-tested-by: Borislav Petkov <b...@alien8.de> Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Tony Li <tony...@amd.com> Cc: Fenghua Yu <fenghua...@intel.com> Cc: Tony Luck <tony.l...@intel.com> Cc: Peter Zijlstra <pet...@infradead.org> Cc: "Rafael J. Wysocki" <r...@rjwysocki.net> Cc: Andy Lutomirski <l...@amacapital.net> Cc: Fengguang Wu <fengguang...@intel.com> Cc: Sherry Hurwitz <sherry.hurw...@amd.com> Cc: Borislav Petkov <b...@suse.de> Cc: "Len Brown" <l...@kernel.org> Link: http://lkml.kernel.org/r/1458871720-3209-1-git-send-email-ray.hu...@amd.com Signed-off-by: Thomas Gleixner <t...@linutronix.de> --- arch/x86/kernel/cpu/powerflags.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kernel/cpu/powerflags.c b/arch/x86/kernel/cpu/powerflags.c index 31f0f33..1dd8294 100644 --- a/arch/x86/kernel/cpu/powerflags.c +++ b/arch/x86/kernel/cpu/powerflags.c @@ -18,4 +18,6 @@ const char *const x86_power_flags[32] = { "", /* tsc invariant mapped to constant_tsc */ "cpb", /* core performance boost */ "eff_freq_ro", /* Readonly aperf/mperf */ + "proc_feedback", /* processor feedback interface */ + "acc_power", /* accumulated power mechanism */ };
[tip:x86/urgent] x86/cpu: Add advanced power management bits
Commit-ID: 34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da Gitweb: http://git.kernel.org/tip/34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da Author: Huang Rui AuthorDate: Fri, 25 Mar 2016 10:08:40 +0800 Committer: Thomas Gleixner CommitDate: Tue, 29 Mar 2016 11:12:11 +0200 x86/cpu: Add advanced power management bits Bit 11 of CPUID 8000_0007 edx is processor feedback interface. Bit 12 of CPUID 8000_0007 edx is accumulated power. Print proper names in proc/cpuinfo Reported-and-tested-by: Borislav Petkov Signed-off-by: Huang Rui Cc: Tony Li Cc: Fenghua Yu Cc: Tony Luck Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Andy Lutomirski Cc: Fengguang Wu Cc: Sherry Hurwitz Cc: Borislav Petkov Cc: "Len Brown" Link: http://lkml.kernel.org/r/1458871720-3209-1-git-send-email-ray.hu...@amd.com Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/powerflags.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kernel/cpu/powerflags.c b/arch/x86/kernel/cpu/powerflags.c index 31f0f33..1dd8294 100644 --- a/arch/x86/kernel/cpu/powerflags.c +++ b/arch/x86/kernel/cpu/powerflags.c @@ -18,4 +18,6 @@ const char *const x86_power_flags[32] = { "", /* tsc invariant mapped to constant_tsc */ "cpb", /* core performance boost */ "eff_freq_ro", /* Readonly aperf/mperf */ + "proc_feedback", /* processor feedback interface */ + "acc_power", /* accumulated power mechanism */ };
Re: [PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
On Mon, Mar 28, 2016 at 11:33:27AM +0200, Borislav Petkov wrote: > On Mon, Mar 28, 2016 at 01:32:14PM +0800, Huang Rui wrote: > > > > return 0; > > @@ -330,6 +446,9 @@ static int fam15h_power_init_data(struct pci_dev *f4, > > > > data->max_cu_acc_power = tmp; > > > > + /* set default interval as 10 ms */ > > Because...? > I checked with HW designer, milliseconds is also a reasonable interval of acc power. And I cannot set too long here, because several seconds will cause the read function to hang for that period of time. So I pick 10ms here, and actually, we can update the interval at /etc/sensors3.conf chip "fam15h_power-*" set power1_average_interval 0.01 Thanks, Rui
Re: [PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
On Mon, Mar 28, 2016 at 11:33:27AM +0200, Borislav Petkov wrote: > On Mon, Mar 28, 2016 at 01:32:14PM +0800, Huang Rui wrote: > > > > return 0; > > @@ -330,6 +446,9 @@ static int fam15h_power_init_data(struct pci_dev *f4, > > > > data->max_cu_acc_power = tmp; > > > > + /* set default interval as 10 ms */ > > Because...? > I checked with HW designer, milliseconds is also a reasonable interval of acc power. And I cannot set too long here, because several seconds will cause the read function to hang for that period of time. So I pick 10ms here, and actually, we can update the interval at /etc/sensors3.conf chip "fam15h_power-*" set power1_average_interval 0.01 Thanks, Rui
Re: [PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
On Mon, Mar 28, 2016 at 11:29:52AM +0200, Borislav Petkov wrote: > On Mon, Mar 28, 2016 at 01:32:12PM +0800, Huang Rui wrote: > > + > > + get_online_cpus(); > > + this_cpu = get_cpu(); > > What now? > > get_online_cpus() is enough. > Will remove get_cpu(). > > + > > + /* > > +* Choose the first online core of each compute unit, and then > > +* read their MSR value of power and ptsc in a single IPI, > > +* because the MSR value of CPU core represent the compute > > +* unit's. > > +*/ > > + for_each_online_cpu(cpu) { > > + target = cpumask_first(topology_sibling_cpumask(cpu)); > > + if (!cpumask_test_cpu(target, mask)) > > + cpumask_set_cpu(target, mask); > > + } > > I think you want something like this: iterate over each core and put one > of them into the mask. > > core = -1; > > for_each_online_cpu(cpu) { > this_core = topology_core_id(cpu); > > if (this_core == core) > continue; > > core = this_core; > > /* get any CPU on this compute unit */ > cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), > mask); > } > Yep, with new x86 topology for core on AMD, using this way should be more clear. Thanks, Rui
Re: [PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
On Mon, Mar 28, 2016 at 11:29:52AM +0200, Borislav Petkov wrote: > On Mon, Mar 28, 2016 at 01:32:12PM +0800, Huang Rui wrote: > > + > > + get_online_cpus(); > > + this_cpu = get_cpu(); > > What now? > > get_online_cpus() is enough. > Will remove get_cpu(). > > + > > + /* > > +* Choose the first online core of each compute unit, and then > > +* read their MSR value of power and ptsc in a single IPI, > > +* because the MSR value of CPU core represent the compute > > +* unit's. > > +*/ > > + for_each_online_cpu(cpu) { > > + target = cpumask_first(topology_sibling_cpumask(cpu)); > > + if (!cpumask_test_cpu(target, mask)) > > + cpumask_set_cpu(target, mask); > > + } > > I think you want something like this: iterate over each core and put one > of them into the mask. > > core = -1; > > for_each_online_cpu(cpu) { > this_core = topology_core_id(cpu); > > if (this_core == core) > continue; > > core = this_core; > > /* get any CPU on this compute unit */ > cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), > mask); > } > Yep, with new x86 topology for core on AMD, using this way should be more clear. Thanks, Rui
[PATCH v5 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm
This patch adds the description to explain the TDP reporting mechanism and accumulated power algorithm. Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> --- Documentation/hwmon/fam15h_power | 57 +++- drivers/hwmon/fam15h_power.c | 2 +- 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power index e2b1b69..2c4fbee 100644 --- a/Documentation/hwmon/fam15h_power +++ b/Documentation/hwmon/fam15h_power @@ -10,14 +10,22 @@ Supported chips: Datasheets: BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors + AMD64 Architecture Programmer's Manual Volume 2: System Programming Author: Andreas Herrmann <herrmann.der.u...@googlemail.com> Description --- +1) Processor TDP (Thermal design power) + +Given a fixed frequency and voltage, the power consumption of a +processor varies based on the workload being executed. Derated power +is the power consumed when running a specific application. Thermal +design power (TDP) is an example of derated power. + This driver permits reading of registers providing power information -of AMD Family 15h and 16h processors. +of AMD Family 15h and 16h processors via TDP algorithm. For AMD Family 15h and 16h processors the following power values can be calculated using different processor northbridge function @@ -37,3 +45,50 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts: On multi-node processors the calculated value is for the entire package and not for a single node. Thus the driver creates sysfs attributes only for internal node0 of a multi-node processor. + +2) Accumulated Power Mechanism + +This driver also introduces an algorithm that should be used to +calculate the average power consumed by a processor during a +measurement interval Tm. The feature of accumulated power mechanism is +indicated by CPUID Fn8000_0007_EDX[12]. + +* Tsample: compute unit power accumulator sample period +* Tref: the PTSC counter period +* PTSC: performance timestamp counter +* N: the ratio of compute unit power accumulator sample period to the + PTSC period +* Jmax: max compute unit accumulated power which is indicated by + MaxCpuSwPwrAcc MSR C001007b +* Jx/Jy: compute unit accumulated power which is indicated by + CpuSwPwrAcc MSR C001007a +* Tx/Ty: the value of performance timestamp counter which is indicated + by CU_PTSC MSR C0010280 +* PwrCPUave: CPU average power + +i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007. + N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]]. + +ii. Read the full range of the cumulative energy value from the new +MSR MaxCpuSwPwrAcc. + Jmax = value returned. +iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jx = value read from CpuSwPwrAcc and Tx = value read from +PTSC. + +iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jy = value read from CpuSwPwrAcc and Ty = value read from +PTSC. + +v. Calculate the average power consumption for a compute unit over +time period (y-x). Unit of result is uWatt. + if (Jy < Jx) // Rollover has occurred + Jdelta = (Jy + Jmax) - Jx + else + Jdelta = Jy - Jx + PwrCPUave = N * Jdelta * 1000 / (Ty - Tx) + +This driver provides PwrCPUave and interval(default is 10 millisecond +and maximum is 1 second): +* power1_average (PwrCPUave) +* power1_average_interval (Interval) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 003564b..c1cad26 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -1,7 +1,7 @@ /* * fam15h_power.c - AMD Family 15h processor power monitoring * - * Copyright (c) 2011 Advanced Micro Devices, Inc. + * Copyright (c) 2011-2016 Advanced Micro Devices, Inc. * Author: Andreas Herrmann <herrmann.der.u...@googlemail.com> * * -- 1.9.1
[PATCH v5 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm
This patch adds the description to explain the TDP reporting mechanism and accumulated power algorithm. Signed-off-by: Huang Rui Cc: Borislav Petkov --- Documentation/hwmon/fam15h_power | 57 +++- drivers/hwmon/fam15h_power.c | 2 +- 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power index e2b1b69..2c4fbee 100644 --- a/Documentation/hwmon/fam15h_power +++ b/Documentation/hwmon/fam15h_power @@ -10,14 +10,22 @@ Supported chips: Datasheets: BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors + AMD64 Architecture Programmer's Manual Volume 2: System Programming Author: Andreas Herrmann Description --- +1) Processor TDP (Thermal design power) + +Given a fixed frequency and voltage, the power consumption of a +processor varies based on the workload being executed. Derated power +is the power consumed when running a specific application. Thermal +design power (TDP) is an example of derated power. + This driver permits reading of registers providing power information -of AMD Family 15h and 16h processors. +of AMD Family 15h and 16h processors via TDP algorithm. For AMD Family 15h and 16h processors the following power values can be calculated using different processor northbridge function @@ -37,3 +45,50 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts: On multi-node processors the calculated value is for the entire package and not for a single node. Thus the driver creates sysfs attributes only for internal node0 of a multi-node processor. + +2) Accumulated Power Mechanism + +This driver also introduces an algorithm that should be used to +calculate the average power consumed by a processor during a +measurement interval Tm. The feature of accumulated power mechanism is +indicated by CPUID Fn8000_0007_EDX[12]. + +* Tsample: compute unit power accumulator sample period +* Tref: the PTSC counter period +* PTSC: performance timestamp counter +* N: the ratio of compute unit power accumulator sample period to the + PTSC period +* Jmax: max compute unit accumulated power which is indicated by + MaxCpuSwPwrAcc MSR C001007b +* Jx/Jy: compute unit accumulated power which is indicated by + CpuSwPwrAcc MSR C001007a +* Tx/Ty: the value of performance timestamp counter which is indicated + by CU_PTSC MSR C0010280 +* PwrCPUave: CPU average power + +i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007. + N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]]. + +ii. Read the full range of the cumulative energy value from the new +MSR MaxCpuSwPwrAcc. + Jmax = value returned. +iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jx = value read from CpuSwPwrAcc and Tx = value read from +PTSC. + +iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC. + Jy = value read from CpuSwPwrAcc and Ty = value read from +PTSC. + +v. Calculate the average power consumption for a compute unit over +time period (y-x). Unit of result is uWatt. + if (Jy < Jx) // Rollover has occurred + Jdelta = (Jy + Jmax) - Jx + else + Jdelta = Jy - Jx + PwrCPUave = N * Jdelta * 1000 / (Ty - Tx) + +This driver provides PwrCPUave and interval(default is 10 millisecond +and maximum is 1 second): +* power1_average (PwrCPUave) +* power1_average_interval (Interval) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 003564b..c1cad26 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -1,7 +1,7 @@ /* * fam15h_power.c - AMD Family 15h processor power monitoring * - * Copyright (c) 2011 Advanced Micro Devices, Inc. + * Copyright (c) 2011-2016 Advanced Micro Devices, Inc. * Author: Andreas Herrmann * * -- 1.9.1
[PATCH v5 6/6] hwmon: (fam15h_power) Add platform check function
This patch adds a platform check function to make code more readable. Signed-off-by: Huang Rui <ray.hu...@amd.com> --- drivers/hwmon/fam15h_power.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index c1cad26..622c646 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -78,6 +78,11 @@ struct fam15h_power_data { unsigned long power_period; }; +static bool is_carrizo_or_later(void) +{ + return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; +} + static ssize_t show_power(struct device *dev, struct device_attribute *attr, char *buf) { @@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, TdpRunAvgAccCap bit field * is extended to 4:31 from 4:25. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) { + if (is_carrizo_or_later()) { running_avg_capture = val >> 4; running_avg_capture = sign_extend32(running_avg_capture, 27); } else { @@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, ApmTdpLimit bit field * is extended to 16:31 from 16:28. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) + if (is_carrizo_or_later()) tdp_limit = val >> 16; else tdp_limit = (val >> 16) & 0x1fff; -- 1.9.1
[PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
This patch introduces an algorithm that computes the average power by reading a delta value of “core power accumulator” register during measurement interval, and then dividing delta value by the length of the time interval. User is able to use power1_average entry to measure the processor power consumption and power1_average_interval entry to set the interval. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... The result is current average processor power consumption in 10 millisecond. The unit of the result is uWatt. Suggested-by: Guenter Roeck <li...@roeck-us.net> Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> --- drivers/hwmon/fam15h_power.c | 119 +++ 1 file changed, 119 insertions(+) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index de6f52b..003564b 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include #include #include @@ -48,6 +50,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_NUM_GROUPS 2 #define MAX_CUS8 +/* set maximum interval as 1 second */ +#define MAX_INTERVAL 1000 + #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define MSR_F15H_PTSC 0xc0010280 @@ -68,6 +73,9 @@ struct fam15h_power_data { u64 cu_acc_power[MAX_CUS]; /* performance timestamp counter */ u64 cpu_sw_pwr_ptsc[MAX_CUS]; + /* online/offline status of current compute unit */ + int cu_on[MAX_CUS]; + unsigned long power_period; }; static ssize_t show_power(struct device *dev, @@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data) rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); + + data->cu_on[cu] = 1; } /* @@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data) if (!ret) return -ENOMEM; + memset(data->cu_on, 0, sizeof(int) * MAX_CUS); + get_online_cpus(); this_cpu = get_cpu(); @@ -192,18 +204,117 @@ static int read_registers(struct fam15h_power_data *data) return 0; } +static ssize_t acc_show_power(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], + jdelta[MAX_CUS]; + u64 tdelta, avg_acc; + int cu, cu_num, ret; + signed long leftover; + + /* +* With the new x86 topology modelling, x86_max_cores is the +* compute unit number. +*/ + cu_num = boot_cpu_data.x86_max_cores; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0; cu < cu_num; cu++) { + prev_cu_acc_power[cu] = data->cu_acc_power[cu]; + prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu]; + } + + leftover = schedule_timeout_interruptible(msecs_to_jiffies(data->power_period)); + if (leftover) + return 0; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0, avg_acc = 0; cu < cu_num; cu++) { + /* check if current compute unit is online */ + if (data->cu_on[cu] == 0) + continue; + + if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) { + jdelta[cu] = data->max_cu_acc_power + data->cu_acc_power[cu]; + jdelta[cu] -= prev_cu_acc_power[cu]; + } else { + jdelta[cu] = data->cu_acc_power[cu] - prev_cu_acc_power[cu]; + } + tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu]; + jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000; + do_div(jdelta[cu], tdelta); + + /* the unit is microWatt */ + avg_acc += jdelta[cu]; + } + + return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); +} +static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); + +static ssize_t acc_show_power_period(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + + return sprintf(buf, "%lu\n", data->power_period); +} + +static ssize_t acc_set_power_period(struct device *dev, +
[PATCH v5 6/6] hwmon: (fam15h_power) Add platform check function
This patch adds a platform check function to make code more readable. Signed-off-by: Huang Rui --- drivers/hwmon/fam15h_power.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index c1cad26..622c646 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -78,6 +78,11 @@ struct fam15h_power_data { unsigned long power_period; }; +static bool is_carrizo_or_later(void) +{ + return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60; +} + static ssize_t show_power(struct device *dev, struct device_attribute *attr, char *buf) { @@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, TdpRunAvgAccCap bit field * is extended to 4:31 from 4:25. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) { + if (is_carrizo_or_later()) { running_avg_capture = val >> 4; running_avg_capture = sign_extend32(running_avg_capture, 27); } else { @@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev, * On Carrizo and later platforms, ApmTdpLimit bit field * is extended to 16:31 from 16:28. */ - if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) + if (is_carrizo_or_later()) tdp_limit = val >> 16; else tdp_limit = (val >> 16) & 0x1fff; -- 1.9.1
[PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
This patch introduces an algorithm that computes the average power by reading a delta value of “core power accumulator” register during measurement interval, and then dividing delta value by the length of the time interval. User is able to use power1_average entry to measure the processor power consumption and power1_average_interval entry to set the interval. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... The result is current average processor power consumption in 10 millisecond. The unit of the result is uWatt. Suggested-by: Guenter Roeck Signed-off-by: Huang Rui Cc: Borislav Petkov --- drivers/hwmon/fam15h_power.c | 119 +++ 1 file changed, 119 insertions(+) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index de6f52b..003564b 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include #include #include @@ -48,6 +50,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_NUM_GROUPS 2 #define MAX_CUS8 +/* set maximum interval as 1 second */ +#define MAX_INTERVAL 1000 + #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define MSR_F15H_PTSC 0xc0010280 @@ -68,6 +73,9 @@ struct fam15h_power_data { u64 cu_acc_power[MAX_CUS]; /* performance timestamp counter */ u64 cpu_sw_pwr_ptsc[MAX_CUS]; + /* online/offline status of current compute unit */ + int cu_on[MAX_CUS]; + unsigned long power_period; }; static ssize_t show_power(struct device *dev, @@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data) rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); + + data->cu_on[cu] = 1; } /* @@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data) if (!ret) return -ENOMEM; + memset(data->cu_on, 0, sizeof(int) * MAX_CUS); + get_online_cpus(); this_cpu = get_cpu(); @@ -192,18 +204,117 @@ static int read_registers(struct fam15h_power_data *data) return 0; } +static ssize_t acc_show_power(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS], + jdelta[MAX_CUS]; + u64 tdelta, avg_acc; + int cu, cu_num, ret; + signed long leftover; + + /* +* With the new x86 topology modelling, x86_max_cores is the +* compute unit number. +*/ + cu_num = boot_cpu_data.x86_max_cores; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0; cu < cu_num; cu++) { + prev_cu_acc_power[cu] = data->cu_acc_power[cu]; + prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu]; + } + + leftover = schedule_timeout_interruptible(msecs_to_jiffies(data->power_period)); + if (leftover) + return 0; + + ret = read_registers(data); + if (ret) + return 0; + + for (cu = 0, avg_acc = 0; cu < cu_num; cu++) { + /* check if current compute unit is online */ + if (data->cu_on[cu] == 0) + continue; + + if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) { + jdelta[cu] = data->max_cu_acc_power + data->cu_acc_power[cu]; + jdelta[cu] -= prev_cu_acc_power[cu]; + } else { + jdelta[cu] = data->cu_acc_power[cu] - prev_cu_acc_power[cu]; + } + tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu]; + jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000; + do_div(jdelta[cu], tdelta); + + /* the unit is microWatt */ + avg_acc += jdelta[cu]; + } + + return sprintf(buf, "%llu\n", (unsigned long long)avg_acc); +} +static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL); + +static ssize_t acc_show_power_period(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct fam15h_power_data *data = dev_get_drvdata(dev); + + return sprintf(buf, "%lu\n", data->power_period); +} + +static ssize_t acc_set_power_period(struct device *dev, + struct device_attribute *attr, +
[PATCH v5 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm
Hi Guenter, This serial of patches introduces an accumulated power reporting algorithm. It will calculate the average power consumption for the processor. The cpu feature flag is CPUID.8000_0007H:EDX[12]. This algorithm is used to test the comparison of processor power consumption with between MWAITX delay and TSC delay on AMD Carrizo platforms. Reference: https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support") Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer") V1: https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com V2: https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com On V5, I used the new x86 topology with compute unit id (cpu_core_id) and compute unit numbers (x86_max_cores) for AMD processors. The change and documentation will go to master soon as below link. And This way is better to use smp_num_siblings which is un-defined variable out of CONFIG_SMP. Refer: https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-urgent=f6ce1851fa2eec2c332255fc25a544658dbfbfe4 Changes from v1 -> v2: - Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to avoid overwrite on multi-CPU system. - Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code. - Remove unnecessary warning print. - Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). - Use power1_average and power1_average_interval standard entry instread of power1_acc - Fix the CPU-hotplug case. Changes from v2 -> v3: - As Guenter's suggestion, remove typecast, use >groups[0]. - Remove all "fam15_power_*" prefix at data. - Remove unnecessary ( ). - Fix the issue that is reported by build test robot, and add CPU_SUP_AMD as the dependence of fam15h_power - Remove the WARN_ON at do_read_registers_on_cu, because it must be behind CPUID check. The MSR must be available since CPUID.8000_0007H:EDX[12] is set - Add get_online_cpus()/put_online_cpus() functions. - Refine commments and the method which generate cpumask for cu. - Add the interval scope to make the value suitable for user experience - Remove the useless mutex. Changes from v3 -> v4: - Rebase the patches to latest groeck/hwmon-next. - Use smp_num_siblings instead of cores_per_cu accessor. - Refine the cpumask method which is inspired by perf solution. - Fix some typo and errors. Changes from v4 -> v5: - Rebase patches to v4.6-rc1. - Use new x86 topology with compute id (cpu_core_id) and compute unit number (x86_max_cores) instead of smp_num_siblings. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... These patches are rebased on v4.6-rc1. Thanks, Rui Huang Rui (6): hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence hwmon: (fam15h_power) Add compute unit accumulated power hwmon: (fam15h_power) Add ptsc counter value for accumulated power hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm hwmon: (fam15h_power) Add platform check function Documentation/hwmon/fam15h_power | 57 ++- drivers/hwmon/Kconfig| 2 +- drivers/hwmon/fam15h_power.c | 199 ++- 3 files changed, 252 insertions(+), 6 deletions(-) -- 1.9.1
[PATCH v5 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm
Hi Guenter, This serial of patches introduces an accumulated power reporting algorithm. It will calculate the average power consumption for the processor. The cpu feature flag is CPUID.8000_0007H:EDX[12]. This algorithm is used to test the comparison of processor power consumption with between MWAITX delay and TSC delay on AMD Carrizo platforms. Reference: https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support") Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer") V1: https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com V2: https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com On V5, I used the new x86 topology with compute unit id (cpu_core_id) and compute unit numbers (x86_max_cores) for AMD processors. The change and documentation will go to master soon as below link. And This way is better to use smp_num_siblings which is un-defined variable out of CONFIG_SMP. Refer: https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-urgent=f6ce1851fa2eec2c332255fc25a544658dbfbfe4 Changes from v1 -> v2: - Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to avoid overwrite on multi-CPU system. - Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code. - Remove unnecessary warning print. - Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). - Use power1_average and power1_average_interval standard entry instread of power1_acc - Fix the CPU-hotplug case. Changes from v2 -> v3: - As Guenter's suggestion, remove typecast, use >groups[0]. - Remove all "fam15_power_*" prefix at data. - Remove unnecessary ( ). - Fix the issue that is reported by build test robot, and add CPU_SUP_AMD as the dependence of fam15h_power - Remove the WARN_ON at do_read_registers_on_cu, because it must be behind CPUID check. The MSR must be available since CPUID.8000_0007H:EDX[12] is set - Add get_online_cpus()/put_online_cpus() functions. - Refine commments and the method which generate cpumask for cu. - Add the interval scope to make the value suitable for user experience - Remove the useless mutex. Changes from v3 -> v4: - Rebase the patches to latest groeck/hwmon-next. - Use smp_num_siblings instead of cores_per_cu accessor. - Refine the cpumask method which is inspired by perf solution. - Fix some typo and errors. Changes from v4 -> v5: - Rebase patches to v4.6-rc1. - Use new x86 topology with compute id (cpu_core_id) and compute unit number (x86_max_cores) instead of smp_num_siblings. A simple example: ray@hr-ub:~/tip$ sensors fam15h_power-pci-00c4 Adapter: PCI adapter power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s) (crit = 15.00 W) ... These patches are rebased on v4.6-rc1. Thanks, Rui Huang Rui (6): hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence hwmon: (fam15h_power) Add compute unit accumulated power hwmon: (fam15h_power) Add ptsc counter value for accumulated power hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm hwmon: (fam15h_power) Add platform check function Documentation/hwmon/fam15h_power | 57 ++- drivers/hwmon/Kconfig| 2 +- drivers/hwmon/fam15h_power.c | 199 ++- 3 files changed, 252 insertions(+), 6 deletions(-) -- 1.9.1
[PATCH v5 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power driver. Because the following patch will use the interface from x86/kernel/cpu/amd.c. Otherwise, the below error might be encountered: All errors (new ones prefixed by >>): drivers/built-in.o: In function `fam15h_power_probe': >> fam15h_power.c:(.text+0x26e3a3): undefined reference to >> `amd_get_cores_per_cu' fam15h_power.c:(.text+0x26e41e): undefined reference to `amd_get_cores_per_cu' Reported-by: build test robot <l...@intel.com> Signed-off-by: Huang Rui <ray.hu...@amd.com> --- drivers/hwmon/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig index 5c2d13a..4be3792 100644 --- a/drivers/hwmon/Kconfig +++ b/drivers/hwmon/Kconfig @@ -288,7 +288,7 @@ config SENSORS_K10TEMP config SENSORS_FAM15H_POWER tristate "AMD Family 15h processor power" - depends on X86 && PCI + depends on X86 && PCI && CPU_SUP_AMD help If you say yes here you get support for processor power information of your AMD family 15h CPU. -- 1.9.1
[PATCH v5 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power
PTSC is the performance timestamp counter value in a cpu core and the cores in one compute unit have the fixed frequency. So it picks up the performance timestamp counter value of the first core per compute unit to measure the interval for average power per compute unit. Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> --- drivers/hwmon/fam15h_power.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index ccbc944..de6f52b 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -50,6 +50,7 @@ MODULE_LICENSE("GPL"); #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b +#define MSR_F15H_PTSC 0xc0010280 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -65,6 +66,8 @@ struct fam15h_power_data { u64 max_cu_acc_power; /* accumulated power of the compute units */ u64 cu_acc_power[MAX_CUS]; + /* performance timestamp counter */ + u64 cpu_sw_pwr_ptsc[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data) cu = cpu_data(cpu).cpu_core_id; rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); + rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); } /* -- 1.9.1
[PATCH v5 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power driver. Because the following patch will use the interface from x86/kernel/cpu/amd.c. Otherwise, the below error might be encountered: All errors (new ones prefixed by >>): drivers/built-in.o: In function `fam15h_power_probe': >> fam15h_power.c:(.text+0x26e3a3): undefined reference to >> `amd_get_cores_per_cu' fam15h_power.c:(.text+0x26e41e): undefined reference to `amd_get_cores_per_cu' Reported-by: build test robot Signed-off-by: Huang Rui --- drivers/hwmon/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig index 5c2d13a..4be3792 100644 --- a/drivers/hwmon/Kconfig +++ b/drivers/hwmon/Kconfig @@ -288,7 +288,7 @@ config SENSORS_K10TEMP config SENSORS_FAM15H_POWER tristate "AMD Family 15h processor power" - depends on X86 && PCI + depends on X86 && PCI && CPU_SUP_AMD help If you say yes here you get support for processor power information of your AMD family 15h CPU. -- 1.9.1
[PATCH v5 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power
PTSC is the performance timestamp counter value in a cpu core and the cores in one compute unit have the fixed frequency. So it picks up the performance timestamp counter value of the first core per compute unit to measure the interval for average power per compute unit. Signed-off-by: Huang Rui Cc: Borislav Petkov --- drivers/hwmon/fam15h_power.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index ccbc944..de6f52b 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -50,6 +50,7 @@ MODULE_LICENSE("GPL"); #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b +#define MSR_F15H_PTSC 0xc0010280 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -65,6 +66,8 @@ struct fam15h_power_data { u64 max_cu_acc_power; /* accumulated power of the compute units */ u64 cu_acc_power[MAX_CUS]; + /* performance timestamp counter */ + u64 cpu_sw_pwr_ptsc[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data) cu = cpu_data(cpu).cpu_core_id; rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); + rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]); } /* -- 1.9.1
[PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
This patch adds a member in fam15h_power_data which specifies the compute unit accumulated power. It adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). This behavior can decrease IPI numbers. Suggested-by: Borislav Petkov <b...@alien8.de> Signed-off-by: Huang Rui <ray.hu...@amd.com> --- drivers/hwmon/fam15h_power.c | 65 +++- 1 file changed, 64 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 4f695d8..ccbc944 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -25,6 +25,8 @@ #include #include #include +#include +#include #include #include @@ -44,7 +46,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_MIN_NUM_ATTRS 2 #define FAM15H_NUM_GROUPS 2 +#define MAX_CUS8 +#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -59,6 +63,8 @@ struct fam15h_power_data { struct attribute_group group; /* maximum accumulated power of a compute unit */ u64 max_cu_acc_power; + /* accumulated power of the compute units */ + u64 cu_acc_power[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -125,6 +131,63 @@ static ssize_t show_power_crit(struct device *dev, } static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); +static void do_read_registers_on_cu(void *_data) +{ + struct fam15h_power_data *data = _data; + int cpu, cu; + + cpu = smp_processor_id(); + + /* +* With the new x86 topology modelling, cpu core id actually +* is compute unit id. +*/ + cu = cpu_data(cpu).cpu_core_id; + + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); +} + +/* + * This function is only able to be called when CPUID + * Fn8000_0007:EDX[12] is set. + */ +static int read_registers(struct fam15h_power_data *data) +{ + int this_cpu, ret, cpu; + int target; + cpumask_var_t mask; + + ret = zalloc_cpumask_var(, GFP_KERNEL); + if (!ret) + return -ENOMEM; + + get_online_cpus(); + this_cpu = get_cpu(); + + /* +* Choose the first online core of each compute unit, and then +* read their MSR value of power and ptsc in a single IPI, +* because the MSR value of CPU core represent the compute +* unit's. +*/ + for_each_online_cpu(cpu) { + target = cpumask_first(topology_sibling_cpumask(cpu)); + if (!cpumask_test_cpu(target, mask)) + cpumask_set_cpu(target, mask); + } + + if (cpumask_test_cpu(this_cpu, mask)) + do_read_registers_on_cu(data); + + smp_call_function_many(mask, do_read_registers_on_cu, data, true); + put_cpu(); + put_online_cpus(); + + free_cpumask_var(mask); + + return 0; +} + static int fam15h_power_init_attrs(struct pci_dev *pdev, struct fam15h_power_data *data) { @@ -263,7 +326,7 @@ static int fam15h_power_init_data(struct pci_dev *f4, data->max_cu_acc_power = tmp; - return 0; + return read_registers(data); } static int fam15h_power_probe(struct pci_dev *pdev, -- 1.9.1
[PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
This patch adds a member in fam15h_power_data which specifies the compute unit accumulated power. It adds do_read_registers_on_cu to do all the read to all MSRs and run it on one of the online cores on each compute unit with smp_call_function_many(). This behavior can decrease IPI numbers. Suggested-by: Borislav Petkov Signed-off-by: Huang Rui --- drivers/hwmon/fam15h_power.c | 65 +++- 1 file changed, 64 insertions(+), 1 deletion(-) diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c index 4f695d8..ccbc944 100644 --- a/drivers/hwmon/fam15h_power.c +++ b/drivers/hwmon/fam15h_power.c @@ -25,6 +25,8 @@ #include #include #include +#include +#include #include #include @@ -44,7 +46,9 @@ MODULE_LICENSE("GPL"); #define FAM15H_MIN_NUM_ATTRS 2 #define FAM15H_NUM_GROUPS 2 +#define MAX_CUS8 +#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4 @@ -59,6 +63,8 @@ struct fam15h_power_data { struct attribute_group group; /* maximum accumulated power of a compute unit */ u64 max_cu_acc_power; + /* accumulated power of the compute units */ + u64 cu_acc_power[MAX_CUS]; }; static ssize_t show_power(struct device *dev, @@ -125,6 +131,63 @@ static ssize_t show_power_crit(struct device *dev, } static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL); +static void do_read_registers_on_cu(void *_data) +{ + struct fam15h_power_data *data = _data; + int cpu, cu; + + cpu = smp_processor_id(); + + /* +* With the new x86 topology modelling, cpu core id actually +* is compute unit id. +*/ + cu = cpu_data(cpu).cpu_core_id; + + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]); +} + +/* + * This function is only able to be called when CPUID + * Fn8000_0007:EDX[12] is set. + */ +static int read_registers(struct fam15h_power_data *data) +{ + int this_cpu, ret, cpu; + int target; + cpumask_var_t mask; + + ret = zalloc_cpumask_var(, GFP_KERNEL); + if (!ret) + return -ENOMEM; + + get_online_cpus(); + this_cpu = get_cpu(); + + /* +* Choose the first online core of each compute unit, and then +* read their MSR value of power and ptsc in a single IPI, +* because the MSR value of CPU core represent the compute +* unit's. +*/ + for_each_online_cpu(cpu) { + target = cpumask_first(topology_sibling_cpumask(cpu)); + if (!cpumask_test_cpu(target, mask)) + cpumask_set_cpu(target, mask); + } + + if (cpumask_test_cpu(this_cpu, mask)) + do_read_registers_on_cu(data); + + smp_call_function_many(mask, do_read_registers_on_cu, data, true); + put_cpu(); + put_online_cpus(); + + free_cpumask_var(mask); + + return 0; +} + static int fam15h_power_init_attrs(struct pci_dev *pdev, struct fam15h_power_data *data) { @@ -263,7 +326,7 @@ static int fam15h_power_init_data(struct pci_dev *f4, data->max_cu_acc_power = tmp; - return 0; + return read_registers(data); } static int fam15h_power_probe(struct pci_dev *pdev, -- 1.9.1
[tip:perf/urgent] perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL
Commit-ID: a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d Gitweb: http://git.kernel.org/tip/a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d Author: Huang Rui <ray.hu...@amd.com> AuthorDate: Fri, 25 Mar 2016 11:18:25 +0800 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Fri, 25 Mar 2016 09:46:53 +0100 perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL randconfig builds can sometimes disable CONFIG_CPU_SUP_INTEL while enabling the AMD power reporting PMU driver, resulting in this build failure: arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' undeclared here (not in a function) To fix it, move events_sysfs_show() outside of #ifdef CONFIG_CPU_SUP_INTEL. Reported-by: Randy Dunlap <rdun...@infradead.org> Reported-by: build test robot <l...@intel.com> Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Borislav Petkov <b...@alien8.de> Cc: Borislav Petkov <b...@suse.de> Cc: Fengguang Wu <fengguang...@intel.com> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Sherry Hurwitz <sherry.hurw...@amd.com> Cc: Stephen Rothwell <s...@canb.auug.org.au> Cc: Thomas Gleixner <t...@linutronix.de> Cc: a...@kernel.org Cc: kbuild-...@01.org Cc: linux-n...@vger.kernel.org Cc: spg_linux_ker...@amd.com Link: http://lkml.kernel.org/r/1458875905-4278-1-git-send-email-ray.hu...@amd.com Signed-off-by: Ingo Molnar <mi...@kernel.org> --- arch/x86/events/perf_event.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index ba6ef18..a6771e2 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config); struct attribute **merge_attr(struct attribute **a, struct attribute **b); +ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, + char *page); + #ifdef CONFIG_CPU_SUP_AMD int amd_pmu_init(void); @@ -925,9 +928,6 @@ int p6_pmu_init(void); int knc_pmu_init(void); -ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, - char *page); - static inline int is_ht_workaround_enabled(void) { return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);
[tip:perf/urgent] perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL
Commit-ID: a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d Gitweb: http://git.kernel.org/tip/a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d Author: Huang Rui AuthorDate: Fri, 25 Mar 2016 11:18:25 +0800 Committer: Ingo Molnar CommitDate: Fri, 25 Mar 2016 09:46:53 +0100 perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL randconfig builds can sometimes disable CONFIG_CPU_SUP_INTEL while enabling the AMD power reporting PMU driver, resulting in this build failure: arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' undeclared here (not in a function) To fix it, move events_sysfs_show() outside of #ifdef CONFIG_CPU_SUP_INTEL. Reported-by: Randy Dunlap Reported-by: build test robot Signed-off-by: Huang Rui Cc: Borislav Petkov Cc: Borislav Petkov Cc: Fengguang Wu Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Sherry Hurwitz Cc: Stephen Rothwell Cc: Thomas Gleixner Cc: a...@kernel.org Cc: kbuild-...@01.org Cc: linux-n...@vger.kernel.org Cc: spg_linux_ker...@amd.com Link: http://lkml.kernel.org/r/1458875905-4278-1-git-send-email-ray.hu...@amd.com Signed-off-by: Ingo Molnar --- arch/x86/events/perf_event.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index ba6ef18..a6771e2 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config); struct attribute **merge_attr(struct attribute **a, struct attribute **b); +ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, + char *page); + #ifdef CONFIG_CPU_SUP_AMD int amd_pmu_init(void); @@ -925,9 +928,6 @@ int p6_pmu_init(void); int knc_pmu_init(void); -ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, - char *page); - static inline int is_ht_workaround_enabled(void) { return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);
[REDO PATCH] x86/perf: Move events_sysfs_show() outside CPU_SUP_INTEL
This patch moves events_sysfs_show outside CONFIG_CPU_SUP_INTEL, because this interface will be also used in an AMD power reporting PMU driver. Otherwise, below build error would be encountered: All error/warnings (new ones prefixed by >>): In file included from include/linux/kobject.h:21:0, from include/linux/module.h:17, from arch/x86/kernel/cpu/perf_event_amd_power.c:13: >> arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' undeclared here (not in a function) .attr = __ATTR(_name, 0444, events_sysfs_show, NULL), \ ^ include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR' .show = _show, \ ^ >> arch/x86/kernel/cpu/perf_event_amd_power.c:244:1: note: in expansion of macro 'EVENT_ATTR_STR' EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01"); ^ Reported-by: Randy Dunlap <rdun...@infradead.org> Reported-by: build test robot <l...@intel.com> Signed-off-by: Huang Rui <ray.hu...@amd.com> Cc: Stephen Rothwell <s...@canb.auug.org.au> Cc: a...@kernel.org --- Hi Peter, Ingo, Randy, According to the comments of below thread, I rebased this fix to latest kernel. http://lkml.kernel.org/r/56f41e3f.8000...@infradead.org Orignal patch: https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1 Thanks, Rui --- arch/x86/events/perf_event.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index ba6ef18..a6771e2 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config); struct attribute **merge_attr(struct attribute **a, struct attribute **b); +ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, + char *page); + #ifdef CONFIG_CPU_SUP_AMD int amd_pmu_init(void); @@ -925,9 +928,6 @@ int p6_pmu_init(void); int knc_pmu_init(void); -ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, - char *page); - static inline int is_ht_workaround_enabled(void) { return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED); -- 1.9.1
[REDO PATCH] x86/perf: Move events_sysfs_show() outside CPU_SUP_INTEL
This patch moves events_sysfs_show outside CONFIG_CPU_SUP_INTEL, because this interface will be also used in an AMD power reporting PMU driver. Otherwise, below build error would be encountered: All error/warnings (new ones prefixed by >>): In file included from include/linux/kobject.h:21:0, from include/linux/module.h:17, from arch/x86/kernel/cpu/perf_event_amd_power.c:13: >> arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' undeclared here (not in a function) .attr = __ATTR(_name, 0444, events_sysfs_show, NULL), \ ^ include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR' .show = _show, \ ^ >> arch/x86/kernel/cpu/perf_event_amd_power.c:244:1: note: in expansion of macro 'EVENT_ATTR_STR' EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01"); ^ Reported-by: Randy Dunlap Reported-by: build test robot Signed-off-by: Huang Rui Cc: Stephen Rothwell Cc: a...@kernel.org --- Hi Peter, Ingo, Randy, According to the comments of below thread, I rebased this fix to latest kernel. http://lkml.kernel.org/r/56f41e3f.8000...@infradead.org Orignal patch: https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1 Thanks, Rui --- arch/x86/events/perf_event.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index ba6ef18..a6771e2 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config); struct attribute **merge_attr(struct attribute **a, struct attribute **b); +ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, + char *page); + #ifdef CONFIG_CPU_SUP_AMD int amd_pmu_init(void); @@ -925,9 +928,6 @@ int p6_pmu_init(void); int knc_pmu_init(void); -ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, - char *page); - static inline int is_ht_workaround_enabled(void) { return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED); -- 1.9.1
Re: linux-next: Tree for Mar 24 (events/amd/power)
On Thu, Mar 24, 2016 at 10:05:03AM -0700, Randy Dunlap wrote: > On 03/23/16 19:09, Stephen Rothwell wrote: > > Hi all, > > > > Please do not add any v4.7 related material to your linux-next included > > trees until after v4.6-rc1 is released. > > > > Changes since 20160323: > > > > on i386: > > when CONFIG_CPU_SUP_INTEL is not enabled: > > In file included from ../include/linux/kobject.h:21:0, > from ../include/linux/module.h:17, > from ../arch/x86/events/amd/power.c:13: > ../arch/x86/events/amd/../perf_event.h:660:31: error: 'events_sysfs_show' > undeclared here (not in a function) > .attr = __ATTR(_name, 0444, events_sysfs_show, NULL), \ >^ > ../include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR' > .show = _show, \ > ^ > ../arch/x86/events/amd/power.c:181:1: note: in expansion of macro > 'EVENT_ATTR_STR' > EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01"); > ^ > ../scripts/Makefile.build:291: recipe for target > 'arch/x86/events/amd/power.o' failed > Randy, thanks to report it. Actually, because one patch is missed to apply. https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1 I will rebase and re-send it. Thanks, Rui
Re: linux-next: Tree for Mar 24 (events/amd/power)
On Thu, Mar 24, 2016 at 10:05:03AM -0700, Randy Dunlap wrote: > On 03/23/16 19:09, Stephen Rothwell wrote: > > Hi all, > > > > Please do not add any v4.7 related material to your linux-next included > > trees until after v4.6-rc1 is released. > > > > Changes since 20160323: > > > > on i386: > > when CONFIG_CPU_SUP_INTEL is not enabled: > > In file included from ../include/linux/kobject.h:21:0, > from ../include/linux/module.h:17, > from ../arch/x86/events/amd/power.c:13: > ../arch/x86/events/amd/../perf_event.h:660:31: error: 'events_sysfs_show' > undeclared here (not in a function) > .attr = __ATTR(_name, 0444, events_sysfs_show, NULL), \ >^ > ../include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR' > .show = _show, \ > ^ > ../arch/x86/events/amd/power.c:181:1: note: in expansion of macro > 'EVENT_ATTR_STR' > EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01"); > ^ > ../scripts/Makefile.build:291: recipe for target > 'arch/x86/events/amd/power.o' failed > Randy, thanks to report it. Actually, because one patch is missed to apply. https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1 I will rebase and re-send it. Thanks, Rui