from:"Huang Rui"

Re: [PATCH] x86, sched: Fix the AMD CPPC maximum perf on some specific generations

2021-04-20 Thread Huang Rui

On Tue, Apr 20, 2021 at 04:22:31PM +0800, Borislav Petkov wrote:
> On Tue, Apr 20, 2021 at 04:09:43PM +0800, Huang Rui wrote:
> > Some AMD Ryzen generations has different calculation method on maximum
> > perf. 255 is not for all asics, some specific generations used 166 as
> > the maximum perf. This patch is to fix the different maximum perf value
> 
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.
> 
> Also, do
> 
> $ git grep 'This patch' Documentation/process
> 
> for more details.

Thanks and good to know, I will enhance the commit message in V2.

> 
> > of AMD CPPC.
> > 
> > Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD 
> > systems")
> > Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost 
> > frequencies")
> > 
> > Reported-by: Jason Bagavatsingham 
> > Tested-by: Jason Bagavatsingham 
> > Signed-off-by: Huang Rui 
> > Cc: Alex Deucher 
> > Cc: Nathan Fontenot 
> > Cc: Rafael J. Wysocki 
> > Cc: Borislav Petkov 
> > Cc: x...@kernel.org
> > ---
> >  arch/x86/kernel/smpboot.c  | 33 ++-
> >  drivers/cpufreq/acpi-cpufreq.c | 41 ++
> >  2 files changed, 73 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index 02813a7f3a7c..705bc5ceb1ea 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -2033,6 +2033,37 @@ static bool intel_set_max_freq_ratio(void)
> >  }
> >  
> >  #ifdef CONFIG_ACPI_CPPC_LIB
> > +static u64 amd_get_highest_perf(void)
> > +{
> 
>   struct cpuinfo_x86 *c = _cpu_data;
> 
> and then you can use "c" everywhere.
> 
> > +   u64 cppc_max_perf;
> 
> u64 for something which fits in a byte?
> 
> Also,
>   max_perf = 255;
> 
> and you can remove the else and default branches below.

I aligned with highest_perf type in get_max_boost_ratio() funciton.

Will clean the "else" and "default" branches in V2.

> 
> > +
> > +   switch (boot_cpu_data.x86) {
> > +   case 0x17:
> > +   if ((boot_cpu_data.x86_model >= 0x30 &&
> > +boot_cpu_data.x86_model < 0x40) ||
> > +   (boot_cpu_data.x86_model >= 0x70 &&
> > +boot_cpu_data.x86_model < 0x80))
> > +   cppc_max_perf = 166;
> > +   else
> > +   cppc_max_perf = 255;
> > +   break;
> > +   case 0x19:
> > +   if ((boot_cpu_data.x86_model >= 0x20 &&
> > +boot_cpu_data.x86_model < 0x30) ||
> > +   (boot_cpu_data.x86_model >= 0x40 &&
> > +boot_cpu_data.x86_model < 0x70))
> > +   cppc_max_perf = 166;
> > +   else
> > +   cppc_max_perf = 255;
> > +   break;
> > +   default:
> > +   cppc_max_perf = 255;
> > +   break;
> > +   }
> > +
> > +   return cppc_max_perf;
> > +}
> 
> Why is this here and not in arch/x86/kernel/cpu/amd.c?

OK, I will modify to abstract this function in amd.c and then call it both
on smpboot and acpi-cpufreq.

> 
> > +
> >  static bool amd_set_max_freq_ratio(void)
> >  {
> > struct cppc_perf_caps perf_caps;
> 
> 
> 
> > @@ -2046,8 +2077,8 @@ static bool amd_set_max_freq_ratio(void)
> > return false;
> > }
> >  
> > -   highest_perf = perf_caps.highest_perf;
> > nominal_perf = perf_caps.nominal_perf;
> > +   highest_perf = amd_get_highest_perf();
> >  
> > if (!highest_perf || !nominal_perf) {
> > pr_debug("Could not retrieve highest or nominal performance\n");
> > diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> > index d1bbc16fba4b..e5c03360db20 100644
> > --- a/drivers/cpufreq/acpi-cpufreq.c
> > +++ b/drivers/cpufreq/acpi-cpufreq.c
> > @@ -630,6 +630,44 @@ static int acpi_cpufreq_blacklist(struct cpuinfo_x86 
> > *c)
> >  #endif
> >  
> >  #ifdef CONFIG_ACPI_CPPC_LIB
> > +
> > +static u64 get_amd_max_boost_ratio(unsigned int cpu, u64 nominal_perf)
> > +{
> > +   u64 boost_ratio, cppc_max_perf;
> > +
> > +   if (!nominal_perf)
> > +   return 0;
> > +
> > +   switch (boot_cpu_data.x86) {
> > +   case

[PATCH] x86, sched: Fix the AMD CPPC maximum perf on some specific generations

2021-04-20 Thread Huang Rui

Some AMD Ryzen generations has different calculation method on maximum
perf. 255 is not for all asics, some specific generations used 166 as
the maximum perf. This patch is to fix the different maximum perf value
of AMD CPPC.

Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD 
systems")
Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost 
frequencies")

Reported-by: Jason Bagavatsingham 
Tested-by: Jason Bagavatsingham 
Signed-off-by: Huang Rui 
Cc: Alex Deucher 
Cc: Nathan Fontenot 
Cc: Rafael J. Wysocki 
Cc: Borislav Petkov 
Cc: x...@kernel.org
---
 arch/x86/kernel/smpboot.c  | 33 ++-
 drivers/cpufreq/acpi-cpufreq.c | 41 ++
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 02813a7f3a7c..705bc5ceb1ea 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2033,6 +2033,37 @@ static bool intel_set_max_freq_ratio(void)
 }
 
 #ifdef CONFIG_ACPI_CPPC_LIB
+static u64 amd_get_highest_perf(void)
+{
+   u64 cppc_max_perf;
+
+   switch (boot_cpu_data.x86) {
+   case 0x17:
+   if ((boot_cpu_data.x86_model >= 0x30 &&
+boot_cpu_data.x86_model < 0x40) ||
+   (boot_cpu_data.x86_model >= 0x70 &&
+boot_cpu_data.x86_model < 0x80))
+   cppc_max_perf = 166;
+   else
+   cppc_max_perf = 255;
+   break;
+   case 0x19:
+   if ((boot_cpu_data.x86_model >= 0x20 &&
+boot_cpu_data.x86_model < 0x30) ||
+   (boot_cpu_data.x86_model >= 0x40 &&
+boot_cpu_data.x86_model < 0x70))
+   cppc_max_perf = 166;
+   else
+   cppc_max_perf = 255;
+   break;
+   default:
+   cppc_max_perf = 255;
+   break;
+   }
+
+   return cppc_max_perf;
+}
+
 static bool amd_set_max_freq_ratio(void)
 {
struct cppc_perf_caps perf_caps;
@@ -2046,8 +2077,8 @@ static bool amd_set_max_freq_ratio(void)
return false;
}
 
-   highest_perf = perf_caps.highest_perf;
nominal_perf = perf_caps.nominal_perf;
+   highest_perf = amd_get_highest_perf();
 
if (!highest_perf || !nominal_perf) {
pr_debug("Could not retrieve highest or nominal performance\n");
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index d1bbc16fba4b..e5c03360db20 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -630,6 +630,44 @@ static int acpi_cpufreq_blacklist(struct cpuinfo_x86 *c)
 #endif
 
 #ifdef CONFIG_ACPI_CPPC_LIB
+
+static u64 get_amd_max_boost_ratio(unsigned int cpu, u64 nominal_perf)
+{
+   u64 boost_ratio, cppc_max_perf;
+
+   if (!nominal_perf)
+   return 0;
+
+   switch (boot_cpu_data.x86) {
+   case 0x17:
+   if ((boot_cpu_data.x86_model >= 0x30 &&
+boot_cpu_data.x86_model < 0x40) ||
+   (boot_cpu_data.x86_model >= 0x70 &&
+boot_cpu_data.x86_model < 0x80))
+   cppc_max_perf = 166;
+   else
+   cppc_max_perf = 255;
+   break;
+   case 0x19:
+   if ((boot_cpu_data.x86_model >= 0x20 &&
+boot_cpu_data.x86_model < 0x30) ||
+   (boot_cpu_data.x86_model >= 0x40 &&
+boot_cpu_data.x86_model < 0x70))
+   cppc_max_perf = 166;
+   else
+   cppc_max_perf = 255;
+   break;
+   default:
+   cppc_max_perf = 255;
+   break;
+   }
+
+   boost_ratio = div_u64(cppc_max_perf << SCHED_CAPACITY_SHIFT,
+ nominal_perf);
+
+   return boost_ratio;
+}
+
 static u64 get_max_boost_ratio(unsigned int cpu)
 {
struct cppc_perf_caps perf_caps;
@@ -646,6 +684,9 @@ static u64 get_max_boost_ratio(unsigned int cpu)
return 0;
}
 
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+   return get_amd_max_boost_ratio(cpu, perf_caps.nominal_perf);
+
highest_perf = perf_caps.highest_perf;
nominal_perf = perf_caps.nominal_perf;
 
-- 
2.25.1

Re: [PATCH 2/2] drm/ttm: optimize the pool shrinker a bit v2

2021-04-15 Thread Huang Rui

On Thu, Apr 15, 2021 at 07:56:24PM +0800, Christian König wrote:
> Switch back to using a spinlock again by moving the IOMMU unmap outside
> of the locked region.
> 
> v2: Add a comment explaining why we need sync_shrinkers().
> 
> Signed-off-by: Christian König 

Series look good for me as well.

Acked-by: Huang Rui 

> ---
>  drivers/gpu/drm/ttm/ttm_pool.c | 44 +-
>  1 file changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index cb38b1a17b09..955836d569cc 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -70,7 +70,7 @@ static struct ttm_pool_type global_uncached[MAX_ORDER];
>  static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER];
>  static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
>  
> -static struct mutex shrinker_lock;
> +static spinlock_t shrinker_lock;
>  static struct list_head shrinker_list;
>  static struct shrinker mm_shrinker;
>  
> @@ -263,9 +263,9 @@ static void ttm_pool_type_init(struct ttm_pool_type *pt, 
> struct ttm_pool *pool,
>   spin_lock_init(>lock);
>   INIT_LIST_HEAD(>pages);
>  
> - mutex_lock(_lock);
> + spin_lock(_lock);
>   list_add_tail(>shrinker_list, _list);
> - mutex_unlock(_lock);
> + spin_unlock(_lock);
>  }
>  
>  /* Remove a pool_type from the global shrinker list and free all pages */
> @@ -273,9 +273,9 @@ static void ttm_pool_type_fini(struct ttm_pool_type *pt)
>  {
>   struct page *p;
>  
> - mutex_lock(_lock);
> + spin_lock(_lock);
>   list_del(>shrinker_list);
> - mutex_unlock(_lock);
> + spin_unlock(_lock);
>  
>   while ((p = ttm_pool_type_take(pt)))
>   ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> @@ -313,24 +313,19 @@ static struct ttm_pool_type 
> *ttm_pool_select_type(struct ttm_pool *pool,
>  static unsigned int ttm_pool_shrink(void)
>  {
>   struct ttm_pool_type *pt;
> - unsigned int num_freed;
>   struct page *p;
>  
> - mutex_lock(_lock);
> + spin_lock(_lock);
>   pt = list_first_entry(_list, typeof(*pt), shrinker_list);
> + list_move_tail(>shrinker_list, _list);
> + spin_unlock(_lock);
>  
>   p = ttm_pool_type_take(pt);
> - if (p) {
> - ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> - num_freed = 1 << pt->order;
> - } else {
> - num_freed = 0;
> - }
> -
> - list_move_tail(>shrinker_list, _list);
> - mutex_unlock(_lock);
> + if (!p)
> + return 0;
>  
> - return num_freed;
> + ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
> + return 1 << pt->order;
>  }
>  
>  /* Return the allocation order based for a page */
> @@ -530,6 +525,11 @@ void ttm_pool_fini(struct ttm_pool *pool)
>   for (j = 0; j < MAX_ORDER; ++j)
>   ttm_pool_type_fini(>caching[i].orders[j]);
>   }
> +
> + /* We removed the pool types from the LRU, but we need to also make sure
> +  * that no shrinker is concurrently freeing pages from the pool.
> +  */
> + sync_shrinkers();
>  }
>  
>  /* As long as pages are available make sure to release at least one */
> @@ -604,7 +604,7 @@ static int ttm_pool_debugfs_globals_show(struct seq_file 
> *m, void *data)
>  {
>   ttm_pool_debugfs_header(m);
>  
> - mutex_lock(_lock);
> + spin_lock(_lock);
>   seq_puts(m, "wc\t:");
>   ttm_pool_debugfs_orders(global_write_combined, m);
>   seq_puts(m, "uc\t:");
> @@ -613,7 +613,7 @@ static int ttm_pool_debugfs_globals_show(struct seq_file 
> *m, void *data)
>   ttm_pool_debugfs_orders(global_dma32_write_combined, m);
>   seq_puts(m, "uc 32\t:");
>   ttm_pool_debugfs_orders(global_dma32_uncached, m);
> - mutex_unlock(_lock);
> + spin_unlock(_lock);
>  
>   ttm_pool_debugfs_footer(m);
>  
> @@ -640,7 +640,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct 
> seq_file *m)
>  
>   ttm_pool_debugfs_header(m);
>  
> - mutex_lock(_lock);
> + spin_lock(_lock);
>   for (i = 0; i < TTM_NUM_CACHING_TYPES; ++i) {
>   seq_puts(m, "DMA ");
>   switch (i) {
> @@ -656,7 +656,7 @@ int ttm_pool_debugfs(struct ttm_pool *pool, struct 
> seq_file *m)
>   }
>   ttm_pool_debugfs_orders(pool->caching[i].orders, m);
>   }
> - mutex_unlock(_lock);
> + spin_unlock(_lock);
>  
>   ttm_pool_debugfs_footer(m);
>   return 0;
> @@ -693,7 +693,7 @@ int ttm_pool_mgr_init(unsigned long num_pages)
>   if (!page_pool_size)
>   page_pool_size = num_pages;
>  
> - mutex_init(_lock);
> + spin_lock_init(_lock);
>   INIT_LIST_HEAD(_list);
>  
>   for (i = 0; i < MAX_ORDER; ++i) {
> -- 
> 2.25.1
>

Re: your mail

2021-04-07 Thread Huang Rui

On Wed, Apr 07, 2021 at 09:27:46AM +0800, songqiang wrote:

Please add the description in the commit message and subject.

Thanks,
Ray

> Signed-off-by: songqiang 
> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
> b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index 14660f723f71..f3698f0ad4d7 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -736,8 +736,16 @@ static void ttm_put_pages(struct page **pages, unsigned 
> npages, int flags,
>   if (++p != pages[i + j])
>   break;
>  
> - if (j == HPAGE_PMD_NR)
> + if (j == HPAGE_PMD_NR) {
>   order = HPAGE_PMD_ORDER;
> + for (j = 1; j < HPAGE_PMD_NR; ++j)
> + page_ref_dec(pages[i+j]);
> + }
>   }
>  #endif
>  
> @@ -868,10 +876,12 @@ static int ttm_get_pages(struct page **pages, unsigned 
> npages, int flags,
>   p = alloc_pages(huge_flags, HPAGE_PMD_ORDER);
>   if (!p)
>   break;
> -
> - for (j = 0; j < HPAGE_PMD_NR; ++j)
> + for (j = 0; j < HPAGE_PMD_NR; ++j) {
>   pages[i++] = p++;
> -
> + if (j > 0)
> + page_ref_inc(pages[i-1]);
> + }
>   npages -= HPAGE_PMD_NR;
>   }
>   }
> 
> 
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=04%7C01%7Cray.huang%40amd.com%7C4ccc617b77d746db5af108d8f98db612%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637533734805563118%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=9bSP90LYdJyJYJYmuphVmqk%2B3%2FE4JPrtXkQTbxwAt68%3Dreserved=0

Re: [PATCH 0/3] iommu/amd: Fix booting with amd_iommu=off

2021-03-17 Thread Huang Rui

On Wed, Mar 17, 2021 at 05:10:34PM +0800, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Hi,
> 
> it turned out that booting a kernel with amd_iommu=off on a machine
> that has an AMD IOMMU causes an early kernel crash. There are two
> reasons for this, and fixing one of them is already sufficient, but
> both reasons deserve fixing, which is done in this patch-set.
> 
> Regards,
> 
>   Joerg
> 
> Joerg Roedel (3):
>   iommu/amd: Move Stoney Ridge check to detect_ivrs()
>   iommu/amd: Don't call early_amd_iommu_init() when AMD IOMMU is
> disabled
>   iommu/amd: Keep track of amd_iommu_irq_remap state

Series are Acked-by: Huang Rui 

> 
>  drivers/iommu/amd/init.c | 36 
>  1 file changed, 20 insertions(+), 16 deletions(-)
> 
> -- 
> 2.30.2
>

Re: [PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable

2021-03-17 Thread Huang Rui

On Tue, Mar 16, 2021 at 11:00:26PM +0800, Joerg Roedel wrote:
> On Tue, Mar 16, 2021 at 09:36:02PM +0800, Huang Rui wrote:
> > Thanks for the comments. Could you please elaborate this?
> > 
> > Do you mean while amd_iommu=off, we won't prepare the IVRS, and even
> > needn't get all ACPI talbes. Because they are never be used and the next
> > state will always goes into IOMMU_CMDLINE_DISABLED, am I right?
> 
> The first problem was that amd_iommu_irq_remap is never set back to
> false when irq-remapping initialization fails in amd_iommu_prepare().
> 
> But there are other problems, like that even when the IOMMU is set to
> disabled on the command line with amd_iommu=off, the code still sets up
> all IOMMUs and registers IRQ domains for them.
> 

Yes, that was the problem I found in my side. No very clear about the
orignal intention. But if you said this is a problem, that makes sense...

> Later the code checks wheter the IOMMU should stay disabled and tears
> everything down, except for the IRQ domains, which stay in the global
> list.
> 
> The APIs do not really support tearing down IRQ domains well, so its not
> so easy to add this to the tear-down path. Now that the IRQ domains stay
> in the list, the ACPI code will come along later and calls the
> ->select() call-back for every IRQ domain, which gets execution to
> irq_remapping_select(), depite IOMMU being disabled and
> amd_iommu_rlookup_table already de-allocated. But since
> amd_iommu_irq_remap is still true the NULL pointer is dereferenced,
> causing the crash.

OK. We found some memleaks here before as well. It looks cause by iommu data
buffer initialization and not be cleared. And yes, if setup iommu here, it
actually causes many issues.

unreferenced object 0x888332047500 (size 224):
  comm "swapper/0", pid 0, jiffies 4294892296 (age 362.192s)
  hex dump (first 32 bytes):
b0 22 03 00 00 00 00 00 00 00 00 40 00 00 00 00  .".@
06 00 00 00 00 00 00 00 00 20 00 00 00 20 00 00  . ... ..
  backtrace:
[<8f162024>] kmem_cache_alloc+0x145/0x440
[<420093ba>] kmem_cache_create_usercopy+0x127/0x2c0
[<f5ed7ff0>] kmem_cache_create+0x16/0x20
[<4c1e4f47>] iommu_go_to_state+0x8e4/0x165d
[<a191b705>] amd_iommu_prepare+0x1a/0x2f
[<afe5b97e>] irq_remapping_prepare+0x35/0x6d
[<209f36b5>] enable_IR_x2apic+0x2b/0x1ae
[<b64491b5>] default_setup_apic_routing+0x16/0x65
[<e89c64a1>] apic_intr_mode_init+0x81/0x10a
[<eaa14bf6>] x86_late_time_init+0x24/0x35
[<e291fc71>] start_kernel+0x509/0x5c7
[<99930dfe>] x86_64_start_reservations+0x24/0x26
[<b369765d>] x86_64_start_kernel+0x75/0x79
[<4f411919>] secondary_startup_64_no_verify+0xb0/0xbb

> 
> When the IRQ domains would not be around, this would also not happen. So
> my patches also change the initializtion to not do all the setup work
> when amd_iommu=off was passed on the command line.
> 

Thanks for the fix and explaination. :-)

Best Regards,
Ray

Re: [PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable

2021-03-16 Thread Huang Rui

On Tue, Mar 16, 2021 at 09:16:34PM +0800, Joerg Roedel wrote:
> Hi Huang,
> 
> On Thu, Mar 11, 2021 at 10:28:07PM +0800, Huang Rui wrote:
> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index f0adbc48fd17..a08e885403b7 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> > @@ -3862,7 +3862,7 @@ static int irq_remapping_select(struct irq_domain *d, 
> > struct irq_fwspec *fwspec,
> > else if (x86_fwspec_is_hpet(fwspec))
> > devid = get_hpet_devid(fwspec->param[0]);
> >  
> > -   if (devid < 0)
> > +   if (devid < 0 || !amd_iommu_rlookup_table)
> > return 0;
> 
> The problem is deeper than this fix suggests. I prepared other fixes for
> this particular problem. Please find them here:
> 

Thanks for the comments. Could you please elaborate this?

Do you mean while amd_iommu=off, we won't prepare the IVRS, and even
needn't get all ACPI talbes. Because they are never be used and the next
state will always goes into IOMMU_CMDLINE_DISABLED, am I right?

Thanks,
Ray

>   
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fjoro%2Flinux.git%2Flog%2F%3Fh%3Diommu-fixesdata=04%7C01%7Cray.huang%40amd.com%7Cce731dda3b444ac9a14108d8e87dbb3e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637514974013915073%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=NVahf1tIfgno%2BNWPu8hY4DygiGWdKXBJ8G6OsD%2BHC14%3Dreserved=0
> 
> Regards,
> 
>   Joerg

Re: [PATCH V2] drm: amd: pm: Mundane typo fixes in the file amdgpu_pm.c

2021-03-14 Thread Huang Rui

On Mon, Mar 15, 2021 at 11:21:36AM +0800, Bhaskar Chowdhury wrote:
> 
> s/"an minimum"/"a minimum"/
> s/"an maxmum"/"a maximum"/
> 
> Signed-off-by: Bhaskar Chowdhury 

Reviewed-by: Huang Rui 

> ---
>  Changes from V1:
>   Randy's suggestion to adjust the subject line text
>   And missed out a spell too,which now included
> 
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> index 5fa65f191a37..308249ae1a22 100644
> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> @@ -3315,9 +3315,9 @@ static ssize_t amdgpu_hwmon_show_mclk_label(struct 
> device *dev,
>   *
>   * - pwm1_max: pulse width modulation fan control maximum level (255)
>   *
> - * - fan1_min: an minimum value Unit: revolution/min (RPM)
> + * - fan1_min: a minimum value Unit: revolution/min (RPM)
>   *
> - * - fan1_max: an maxmum value Unit: revolution/max (RPM)
> + * - fan1_max: a maximum value Unit: revolution/max (RPM)
>   *
>   * - fan1_input: fan speed in RPM
>   *
> --
> 2.30.2
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=04%7C01%7Cray.huang%40amd.com%7C7815a224727f4c9b556008d8e76182b4%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637513753293707291%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=ApeKXDkijsihPcGSoI9v8ypRcsUXSb2Y7%2FpKsm3c4Xo%3Dreserved=0

[PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable

2021-03-11 Thread Huang Rui

While amd_iommu is set to disable in cmd line, it will free the iommu
resources. Then the pages of rlookup table is freed as well. If that, we
have to check rlookup table in irq_remapping_select(), otherwise, it
will trigger a kernel panic below.

[2.245855] BUG: kernel NULL pointer dereference, address: 0500
[2.252861] #PF: supervisor read access in kernel mode
[2.258053] #PF: error_code(0x) - not-present page
[2.263247] PGD 0 P4D 0
[2.265844] Oops:  [#1] SMP NOPTI
[2.269570] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-custom #1
[2.276150] Hardware name: AMD Chachani-VN/Chachani-VN, BIOS VCH162755N.FD 
03/04/2021
[2.284085] RIP: 0010:irq_remapping_select+0x5c/0xb0
[2.289107] Code: 4b 0c 48 3d 70 7c c8 8f 75 0d eb 35 48 8b 00 48 3d 70 7c 
c8 8f 74 2a 0f b6 50 10 39 d1 75 ed 0f b7 40 12 48 8b 15 f4 8a 3b 02 <48> 8b 14 
c2 48 85 d2 74 0e b8 01 00 00 00 4c 3b a2 90 04 00 00 74
[2.307999] RSP: :8f403d40 EFLAGS: 00010246
[2.313285] RAX: 00a0 RBX: 8f403d98 RCX: 0021
[2.320471] RDX:  RSI: 000a RDI: 8cbb4006118a
[2.327658] RBP: 8f403d50 R08: 0021 R09: 0002
[2.334838] R10: 000a R11: f000 R12: 8cbb401bfe40
[2.342022] R13:  R14: 8cbb401be900 R15: 
[2.349210] FS:  () GS:8cbd73e0() 
knlGS:
[2.357399] CS:  0010 DS:  ES:  CR0: 80050033
[2.363198] CR2: 0500 CR3: 0002dae0a000 CR4: 000406b0
[2.370382] Call Trace:
[2.372897]  irq_find_matching_fwspec+0x48/0xd0
[2.377489]  mp_irqdomain_create+0x7c/0x180
[2.381736]  ? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
[2.388320]  setup_IO_APIC+0x81/0x875
[2.392048]  ? clear_IO_APIC_pin+0xd6/0x130
[2.396294]  ? clear_IO_APIC+0x39/0x60
[2.400103]  apic_intr_mode_init+0x107/0x10a
[2.404432]  x86_late_time_init+0x24/0x35
[2.408507]  start_kernel+0x509/0x5c7
[2.412230]  x86_64_start_reservations+0x24/0x26
[2.416911]  x86_64_start_kernel+0x75/0x79
[2.421068]  secondary_startup_64_no_verify+0xb0/0xbb

Fixes: a1a785b572425 ("iommu/amd: Implement select() method on remapping 
irqdomain")
Tested-by: Xiaojian Du 
Signed-off-by: Huang Rui 
Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Cc: Alex Deucher 
Cc: sta...@vger.kernel.org
---
 drivers/iommu/amd/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f0adbc48fd17..a08e885403b7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3862,7 +3862,7 @@ static int irq_remapping_select(struct irq_domain *d, 
struct irq_fwspec *fwspec,
else if (x86_fwspec_is_hpet(fwspec))
devid = get_hpet_devid(fwspec->param[0]);
 
-   if (devid < 0)
+   if (devid < 0 || !amd_iommu_rlookup_table)
return 0;
 
iommu = amd_iommu_rlookup_table[devid];
-- 
2.25.1

Re: [PATCH][next] drm/amdgpu: Fix masking binary not operator on two mask operations

2021-01-24 Thread Huang Rui

On Fri, Jan 22, 2021 at 11:00:22PM +0800, Colin King wrote:
> From: Colin Ian King 
> 
> Currently the ! operator is incorrectly being used to flip bits on
> mask values. Fix this by using the bit-wise ~ operator instead.
> 
> Addresses-Coverity: ("Logical vs. bitwise operator")
> Fixes: 3c9a7b7d6e75 ("drm/amdgpu: update mmhub mgcg for mmhub_v2_3")
> Signed-off-by: Colin Ian King 

Thanks.

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c 
> b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c
> index 1961745e89c7..ab9be5ad5a5f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_3.c
> @@ -531,12 +531,12 @@ mmhub_v2_3_update_medium_grain_light_sleep(struct 
> amdgpu_device *adev,
>  
>   if (enable && (adev->cg_flags & AMD_CG_SUPPORT_MC_LS)) {
>   data &= ~MM_ATC_L2_CGTT_CLK_CTRL__MGLS_OVERRIDE_MASK;
> - data1 &= !(DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_MASK |
> + data1 &= ~(DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_MASK |
>   DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_WRITE_MASK |
>   DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_READ_MASK |
>   DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_RETURN_MASK |
>   DAGB0_WR_CGTT_CLK_CTRL__LS_OVERRIDE_REGISTER_MASK);
> - data2 &= !(DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_MASK |
> + data2 &= ~(DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_MASK |
>   DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_WRITE_MASK |
>   DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_READ_MASK |
>   DAGB0_RD_CGTT_CLK_CTRL__LS_OVERRIDE_RETURN_MASK |
> -- 
> 2.29.2
>

Re: linux-next: build warning after merge of the amdgpu tree

2021-01-18 Thread Huang Rui

On Mon, Jan 18, 2021 at 02:52:18PM +0800, Stephen Rothwell wrote:
> Hi Ray,
> 
> On Mon, 18 Jan 2021 05:56:20 + "Huang, Ray"  wrote:
> >
> > [AMD Public Use]
> > 
> > Thanks Setphen. Please check attached V2 patch.
> > (If you can share the config file to me, I can do the test in my side)
> 
> It looks good, but I did not try to build it.  I have attached the
> config (it is just a powerpc allyesconfig).
> -- 
> Cheers,
> Stephen Rothwell

Thanks Stephen.

Ray

Re: linux-next: build failure after merge of the amdgpu tree

2021-01-14 Thread Huang Rui

On Fri, Jan 15, 2021 at 01:35:05PM +0800, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the amdgpu tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
> 
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function 
> 'vangogh_get_smu_metrics_data':
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:300:10: error: 
> 'boot_cpu_data' undeclared (first use in this function); did you mean 
> 'boot_cpuid'?

Ah, vangogh is an x86 cpu, let me look at this issue.

Could you share me the config file you tested?

Thanks,
Ray

>   300 |  boot_cpu_data.x86_max_cores * sizeof(uint16_t));
>   |  ^
>   |  boot_cpuid
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function 
> 'vangogh_read_sensor':
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1320:11: error: 
> 'boot_cpu_data' undeclared (first use in this function); did you mean 
> 'boot_cpuid'?
>  1320 |   *size = boot_cpu_data.x86_max_cores * sizeof(uint16_t);
>   |   ^
>   |   boot_cpuid
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c: In function 
> 'vangogh_od_edit_dpm_table':
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:1460:19: error: 
> 'boot_cpu_data' undeclared (first use in this function); did you mean 
> 'boot_cpuid'?
>  1460 |   if (input[0] >= boot_cpu_data.x86_max_cores) {
>   |   ^
>   |   boot_cpuid
> 
> Caused by commits
> 
>   517cb957c43b ("drm/amd/pm: implement the processor clocks which read by 
> metric")
>   0d90d0ddd10e ("drm/amd/pm: implement processor fine grain feature for 
> vangogh (v3)")
> 
> The only thing I could do easily is to disable CONFIG_DRM_AMDGPU for today.
> 
> -- 
> Cheers,
> Stephen Rothwell

Re: [PATCH] drm: amdgpu: pm: Mark vangogh_clk_dpm_is_enabled() as static

2021-01-12 Thread Huang Rui

On Wed, Jan 13, 2021 at 10:21:26AM +0800, Huang Rui wrote:
> On Wed, Jan 13, 2021 at 10:13:02AM +0800, Alex Deucher wrote:
> > On Tue, Jan 12, 2021 at 8:19 PM Huang Rui  wrote:
> > >
> > > On Wed, Jan 13, 2021 at 03:57:22AM +0800, Souptick Joarder wrote:
> > > > kernel test robot throws below warnings ->
> > > >
> > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6:
> > > > warning: no previous prototype for 'vangogh_clk_dpm_is_enabled'
> > > > [-Wmissing-prototypes]
> > > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6:
> > > > warning: no previous prototype for function 'vangogh_clk_dpm_is_enabled'
> > > > [-Wmissing-prototypes]
> > > >
> > > > Mark vangogh_clk_dpm_is_enabled() as static.
> > > >
> > > > Reported-by: kernel test robot 
> > > > Signed-off-by: Souptick Joarder 
> > > > ---
> > > >  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
> > > > b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> > > > index 75ddcad..3ffe56e 100644
> > > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> > > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> > > > @@ -610,7 +610,7 @@ static int vangogh_get_profiling_clk_mask(struct 
> > > > smu_context *smu,
> > > >   return 0;
> > > >  }
> > > >
> > > > -bool vangogh_clk_dpm_is_enabled(struct smu_context *smu,
> > > > +static bool vangogh_clk_dpm_is_enabled(struct smu_context *smu,
> > > >   enum smu_clk_type clk_type)
> > >
> > > Ah, I have another patch which will use this function in another file.
> > >
> > 
> > I can drop it if you plan to land those patches soon.
> 
> Thanks Alex. Yes, I will upload them after verify them on the new firmware
> today.

Sorry Alex, I miss read the function name as "cclk_dpm".
This patch is good, please go forward to apply it.

Reviewed-by: Huang Rui 

Thanks,
Ray

> 
> Thanks,
> Ray
> 
> > 
> > Alex
> > 
> > 
> > > Thanks,
> > > Ray
> > >
> > > >  {
> > > >   enum smu_feature_mask feature_id = 0;
> > > > --
> > > > 1.9.1
> > > >
> > > ___
> > > amd-gfx mailing list
> > > amd-...@lists.freedesktop.org
> > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=04%7C01%7Cray.huang%40amd.com%7C19fce6891f2d4f6df7de08d8b768c8a9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637461007972405505%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=v2JGTkklMHOE2hN1s4dYZ1hT7ctLeUHpwkpn1M3nyi8%3Dreserved=0

Re: [PATCH] drm: amdgpu: pm: Mark vangogh_clk_dpm_is_enabled() as static

2021-01-12 Thread Huang Rui

On Wed, Jan 13, 2021 at 10:13:02AM +0800, Alex Deucher wrote:
> On Tue, Jan 12, 2021 at 8:19 PM Huang Rui  wrote:
> >
> > On Wed, Jan 13, 2021 at 03:57:22AM +0800, Souptick Joarder wrote:
> > > kernel test robot throws below warnings ->
> > >
> > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6:
> > > warning: no previous prototype for 'vangogh_clk_dpm_is_enabled'
> > > [-Wmissing-prototypes]
> > > drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6:
> > > warning: no previous prototype for function 'vangogh_clk_dpm_is_enabled'
> > > [-Wmissing-prototypes]
> > >
> > > Mark vangogh_clk_dpm_is_enabled() as static.
> > >
> > > Reported-by: kernel test robot 
> > > Signed-off-by: Souptick Joarder 
> > > ---
> > >  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
> > > b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> > > index 75ddcad..3ffe56e 100644
> > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> > > @@ -610,7 +610,7 @@ static int vangogh_get_profiling_clk_mask(struct 
> > > smu_context *smu,
> > >   return 0;
> > >  }
> > >
> > > -bool vangogh_clk_dpm_is_enabled(struct smu_context *smu,
> > > +static bool vangogh_clk_dpm_is_enabled(struct smu_context *smu,
> > >   enum smu_clk_type clk_type)
> >
> > Ah, I have another patch which will use this function in another file.
> >
> 
> I can drop it if you plan to land those patches soon.

Thanks Alex. Yes, I will upload them after verify them on the new firmware
today.

Thanks,
Ray

> 
> Alex
> 
> 
> > Thanks,
> > Ray
> >
> > >  {
> > >   enum smu_feature_mask feature_id = 0;
> > > --
> > > 1.9.1
> > >
> > ___
> > amd-gfx mailing list
> > amd-...@lists.freedesktop.org
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=04%7C01%7Cray.huang%40amd.com%7C19fce6891f2d4f6df7de08d8b768c8a9%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637461007972405505%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=v2JGTkklMHOE2hN1s4dYZ1hT7ctLeUHpwkpn1M3nyi8%3Dreserved=0

Re: [PATCH] drm: amdgpu: pm: Mark vangogh_clk_dpm_is_enabled() as static

2021-01-12 Thread Huang Rui

On Wed, Jan 13, 2021 at 03:57:22AM +0800, Souptick Joarder wrote:
> kernel test robot throws below warnings ->
> 
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6:
> warning: no previous prototype for 'vangogh_clk_dpm_is_enabled'
> [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/vangogh_ppt.c:594:6:
> warning: no previous prototype for function 'vangogh_clk_dpm_is_enabled'
> [-Wmissing-prototypes]
> 
> Mark vangogh_clk_dpm_is_enabled() as static.
> 
> Reported-by: kernel test robot 
> Signed-off-by: Souptick Joarder 
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> index 75ddcad..3ffe56e 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
> @@ -610,7 +610,7 @@ static int vangogh_get_profiling_clk_mask(struct 
> smu_context *smu,
>   return 0;
>  }
>  
> -bool vangogh_clk_dpm_is_enabled(struct smu_context *smu,
> +static bool vangogh_clk_dpm_is_enabled(struct smu_context *smu,
>   enum smu_clk_type clk_type)

Ah, I have another patch which will use this function in another file.

Thanks,
Ray

>  {
>   enum smu_feature_mask feature_id = 0;
> -- 
> 1.9.1
>

Re: 5.11-rc1 TTM list corruption

2021-01-05 Thread Huang Rui

On Tue, Jan 05, 2021 at 07:43:51PM +0800, Borislav Petkov wrote:
> On Tue, Jan 05, 2021 at 07:08:52PM +0800, Huang Rui wrote:
> > Ah, this asic is a bit old and still use radeon driver. So we didn't
> > reproduce it on amdgpu driver. I don't have such the old asic in my hand.
> > May we know whether this issue can be duplicated after SI which is used
> > amdgpu module (not sure whether you have recent APU or GPU)?
> 
> The latest I have (I think it is the latest) is:
> 
> [1.826102] [drm] initializing kernel modesetting (RENOIR 0x1002:0x1636 
> 0x17AA:0x5099 0xD1).
> 
> and so far that hasn't triggered it. Which makes sense because that
> thing uses amdgpu:
> 
> [1.810260] [drm] amdgpu kernel modesetting enabled.

Yes! Renoir is late enough for amdgpu kernel module. :-)
Please let us know if you still encounter the issue.

Thanks,
Ray

Re: 5.11-rc1 TTM list corruption

2021-01-05 Thread Huang Rui

On Tue, Jan 05, 2021 at 06:31:38PM +0800, Borislav Petkov wrote:
> Hi,
> 
> On Tue, Jan 05, 2021 at 12:12:13PM +0800, Huang Rui wrote:
> > I am reproducing this issue as well, are you using a Raven board?
> 
> I have no clue what Raven is. The workstation I triggered it once on, has:
> 
> [7.563968] [drm] radeon kernel modesetting enabled.
> [7.581417] [drm] initializing kernel modesetting (CAICOS 0x1002:0x6779 
> 0x174B:0xE164 0x00).

Ah, this asic is a bit old and still use radeon driver. So we didn't
reproduce it on amdgpu driver. I don't have such the old asic in my hand.
May we know whether this issue can be duplicated after SI which is used
amdgpu module (not sure whether you have recent APU or GPU)?

Thanks,
Ray

> [7.609217] [drm] Detected VRAM RAM=2048M, BAR=256M
> [7.614031] [drm] RAM width 64bits DDR
> [7.639665] [drm] radeon: 2048M of VRAM memory ready
> [7.644557] [drm] radeon: 1024M of GTT memory ready.
> [7.649451] [drm] Loading CAICOS Microcode
> [7.653548] [drm] Internal thermal controller without fan control
> [7.661221] [drm] radeon: dpm initialized
> [7.665227] [drm] GART: num cpu pages 262144, num gpu pages 262144
> [7.671821] [drm] enabling PCIE gen 2 link speeds, disable with 
> radeon.pcie_gen2=0
> [7.703858] [drm] PCIE GART of 1024M enabled (table at 0x00162000).
> [7.749689] [drm] radeon: irq initialized.
> [7.769826] [drm] ring test on 0 succeeded in 1 usecs
> [7.774797] [drm] ring test on 3 succeeded in 3 usecs
> [7.955500] [drm] ring test on 5 succeeded in 1 usecs
> [7.960468] [drm] UVD initialized successfully.
> [7.965047] [drm] ib test on ring 0 succeeded in 0 usecs
> [7.970316] [drm] ib test on ring 3 succeeded in 0 usecs
> [8.626877] [drm] ib test on ring 5 succeeded
> [8.631376] [drm] Radeon Display Connectors
> [8.635496] [drm] Connector 0:
> [8.638503] [drm]   HDMI-A-1
> [8.641339] [drm]   HPD2
> [8.643835] [drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 
> 0x644c
> [8.651102] [drm]   Encoders:
> [8.654022] [drm] DFP1: INTERNAL_UNIPHY1
> [8.658224] [drm] Connector 1:
> [8.661232] [drm]   DVI-D-1
> [8.663982] [drm]   HPD4
> [8.666479] [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 
> 0x646c
> [8.673745] [drm]   Encoders:
> [8.676665] [drm] DFP2: INTERNAL_UNIPHY
> [8.680782] [drm] Connector 2:
> [8.683789] [drm]   VGA-1
> [8.686369] [drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 
> 0x643c
> [8.693636] [drm]   Encoders:
> [8.696555] [drm] CRT1: INTERNAL_KLDSCP_DAC1
> [8.788923] [drm] fb mappable at 0xE0363000
> [8.793036] [drm] vram apper at 0xE000
> [8.797064] [drm] size 9216000
> [8.800071] [drm] fb depth is 24
> [8.803249] [drm]pitch is 7680
> [8.807106] fbcon: radeondrmfb (fb0) is primary device
> [8.918927] radeon :1d:00.0: [drm] fb0: radeondrmfb frame buffer device
> [8.938461] [drm] Initialized radeon 2.50.0 20080528 for :1d:00.0 on 
> minor 0
> 
> HTH.
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquettedata=04%7C01%7Cray.huang%40amd.com%7C31b8dcd4040e4a49380e08d8b16517ad%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637454395066317813%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=al4lLGA%2BCdHK4HzO8M5VJthY8Iv71xQ0TsDGwJpgs1A%3Dreserved=0

Re: 5.11-rc1 TTM list corruption

2021-01-04 Thread Huang Rui

On Mon, Jan 04, 2021 at 06:58:02PM +0800, Borislav Petkov wrote:
> On Fri, Jan 01, 2021 at 03:34:28PM +0100, Christian K?nig wrote:
> > Going to double check the code, but can you reproduce this issue
> > reliable?
> 
> Lemme find a test box which can trigger it too - the splat happened
> on my workstation and I'd like to avoid debugging there for obvious
> reasons.

Hi Boris, Christian,

I am reproducing this issue as well, are you using a Raven board?

Thanks,
Ray

> 
> Thx.
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquettedata=04%7C01%7Cray.huang%40amd.com%7C33b48c914b5b4672ac7308d8b09f9a03%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637453546869304657%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=e3Fj4KGz5n0D9O0zGApDTfstJpNmeu6HSJN2oa8iSKA%3Dreserved=0

Re: [PATCH] PCI: Mark AMD Raven iGPU ATS as broken

2020-11-23 Thread Huang Rui

On Tue, Nov 24, 2020 at 06:51:11AM +0800, Kuehling, Felix wrote:
> On 2020-11-23 5:33 p.m., Will Deacon wrote:
> > On Mon, Nov 23, 2020 at 09:04:14PM +, Deucher, Alexander wrote:
> >> [AMD Public Use]
> >>
> >>> -Original Message-
> >>> From: Will Deacon 
> >>> Sent: Monday, November 23, 2020 8:44 AM
> >>> To: linux-kernel@vger.kernel.org
> >>> Cc: linux-...@vger.kernel.org; io...@lists.linux-foundation.org; Will
> >>> Deacon ; Bjorn Helgaas ;
> >>> Deucher, Alexander ; Edgar Merger
> >>> ; Joerg Roedel 
> >>> Subject: [PATCH] PCI: Mark AMD Raven iGPU ATS as broken
> >>>
> >>> Edgar Merger reports that the AMD Raven GPU does not work reliably on his
> >>> system when the IOMMU is enabled:
> >>>
> >>>| [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
> >>> signaled seq=1, emitted seq=3
> >>>| [...]
> >>>| amdgpu :0b:00.0: GPU reset begin!
> >>>| AMD-Vi: Completion-Wait loop timed out
> >>>| iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT
> >>> device=0b:00.0 address=0x38edc0970]
> >>>
> >>> This is indicative of a hardware/platform configuration issue so, since
> >>> disabling ATS has been shown to resolve the problem, add a quirk to match
> >>> this particular device while Edgar follows-up with AMD for more 
> >>> information.
> >>>
> >>> Cc: Bjorn Helgaas 
> >>> Cc: Alex Deucher 
> >>> Reported-by: Edgar Merger 
> >>> Suggested-by: Joerg Roedel 
> >>> Link:
> >>> https://lore.
> >>> kernel.org/linux-
> >>> iommu/MWHPR10MB1310F042A30661D4158520B589FC0@MWHPR10M
> >>> B1310.namprd10.prod.outlook.com
> >>> her%40amd.com%7C1a883fe14d0c408e7d9508d88fb5df4e%7C3dd8961fe488
> >>> 4e608e11a82d994e183d%7C0%7C0%7C637417358593629699%7CUnknown%7
> >>> CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwi
> >>> LCJXVCI6Mn0%3D%7C1000sdata=TMgKldWzsX8XZ0l7q3%2BszDWXQJJ
> >>> LOUfX5oGaoLN8n%2B8%3Dreserved=0
> >>> Signed-off-by: Will Deacon 
> >>> ---
> >>>
> >>> Hi all,
> >>>
> >>> Since Joerg is away at the moment, I'm posting this to try to make some
> >>> progress with the thread in the Link: tag.
> >> + Felix
> >>
> >> What system is this?  Can you provide more details?  Does a sbios update
> >> fix this?  Disabling ATS for all Ravens will break GPU compute for a lot
> >> of people.  I'd prefer to just black list this particular system (e.g.,
> >> just SSIDs or revision) if possible.
> 
> +Ray
> 
> There are already many systems where the IOMMU is disabled in the BIOS, 
> or the CRAT table reporting the APU compute capabilities is broken. Ray 
> has been working on a fallback to make APUs behave like dGPUs on such 
> systems. That should also cover this case where ATS is blacklisted. That 
> said, it affects the programming model, because we don't support the 
> unified and coherent memory model on dGPUs like we do on APUs with 
> IOMMUv2. So it would be good to make the conditions for this workaround 
> as narrow as possible.

Yes, besides the comments from Alex and Felix, may we get your firmware
version (SMC firmware which is from SBIOS) and device id?

> >>>| [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout,
> >>> signaled seq=1, emitted seq=3

It looks only gfx ib test passed, and fails to lanuch desktop, am I right?

We would like to see whether it is Raven, Raven kicker (new Raven), or
Picasso. In our side, per the internal test result, we didn't see the
similiar issue on Raven kicker and Picasso platform.

Thanks,
Ray

> 
> These are the relevant changes in KFD and Thunk for reference:
> 
> ### KFD ###
> 
> commit 914913ab04dfbcd0226ecb6bc99d276832ea2908
> Author: Huang Rui 
> Date:   Tue Aug 18 14:54:23 2020 +0800
> 
>      drm/amdkfd: implement the dGPU fallback path for apu (v6)
> 
>      We still have a few iommu issues which need to address, so force raven
>      as "dgpu" path for the moment.
> 
>      This is to add the fallback path to bypass IOMMU if IOMMU v2 is 
> disabled
>      or ACPI CRAT table not correct.
> 
>      v2: Use ignore_crat parameter to decide whether it will go with 
> IOMMUv2.
>      v3: Align with existed thunk, don't change the way of raven, only 
> renoir
>      will use "dgpu" path by default.
>      v4: don't u

Re: [PATCH v2 -next] drm/amdkfd: Fix -Wunused-const-variable warning

2020-09-10 Thread Huang Rui

On Thu, Sep 10, 2020 at 03:50:06PM +0800, YueHaibing wrote:
> If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns:
> 
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: 
> ‘raven_device_info’ defined but not used [-Wunused-const-variable=]
>  static const struct kfd_device_info raven_device_info = {
>  ^~~~~~~~~
> 
> As Huang Rui suggested, Raven already has the fallback path,
> so it should be out of IOMMU v2 flag.
> 
> Suggested-by: Huang Rui 
> Signed-off-by: YueHaibing 

Acked-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 0e71a0543f98..e3fc6ed7b79c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -503,8 +503,8 @@ static const struct kfd_device_info 
> *kfd_supported_devices[][2] = {
>  #ifdef KFD_SUPPORT_IOMMU_V2
>   [CHIP_KAVERI] = {_device_info, NULL},
>   [CHIP_CARRIZO] = {_device_info, NULL},
> - [CHIP_RAVEN] = {_device_info, NULL},
>  #endif
> + [CHIP_RAVEN] = {_device_info, NULL},
>   [CHIP_HAWAII] = {_device_info, NULL},
>   [CHIP_TONGA] = {_device_info, NULL},
>   [CHIP_FIJI] = {_device_info, _vf_device_info},
> -- 
> 2.17.1
> 
>

Re: [PATCH -next] drm/amdkfd: Fix -Wunused-const-variable warning

2020-09-10 Thread Huang Rui

On Thu, Sep 10, 2020 at 10:55:32AM +0800, YueHaibing wrote:
> If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns:
> 
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: 
> ‘raven_device_info’ defined but not used [-Wunused-const-variable=]
>  static const struct kfd_device_info raven_device_info = {
>  ^
> 
> Move it to ifdef block.
> 
> Signed-off-by: YueHaibing 
> ---

Raven already has the fallback path, so it should be out of IOMMU v2 flag.

You may want to move raven_device_info out of IOMMU v2 flag in 
kfd_supported_devices[][2] as well.

Thanks,
Ray

>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 0e71a0543f98..cae4df259e26 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -116,7 +116,6 @@ static const struct kfd_device_info carrizo_device_info = 
> {
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 2,
>  };
> -#endif
>  
>  static const struct kfd_device_info raven_device_info = {
>   .asic_family = CHIP_RAVEN,
> @@ -135,6 +134,7 @@ static const struct kfd_device_info raven_device_info = {
>   .num_xgmi_sdma_engines = 0,
>   .num_sdma_queues_per_engine = 2,
>  };
> +#endif
>  
>  static const struct kfd_device_info hawaii_device_info = {
>   .asic_family = CHIP_HAWAII,
> -- 
> 2.17.1
> 
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-develdata=02%7C01%7Cray.huang%40amd.com%7Ce8805fd43e9a4ad33cd008d8555a567a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637353193795034052sdata=HqUGjUpQtfCkZ3wA1%2F6HTCZrn%2F4%2BuQORTobzaIYa%2BNs%3Dreserved=0

Re: [PATCH] drm/amdgpu/powerplay: Fix missing break in switch statement

2019-03-03 Thread Huang Rui

On Fri, Mar 01, 2019 at 03:29:43PM -0600, Gustavo A. R. Silva wrote:
> Add missing break statement in order to prevent the code from falling
> through to case SMU_Discrete_DpmTable.
> 
> This bug was found thanks to the ongoing efforts to enable
> -Wimplicit-fallthrough.
> 
> Fixes: 34a564eaf528 ("drm/amd/powerplay: implement fw image related smum 
> interface for Polaris.")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Gustavo A. R. Silva 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> index 52abca065764..222fb79d319e 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> @@ -2330,6 +2330,7 @@ static uint32_t polaris10_get_offsetof(uint32_t type, 
> uint32_t member)
>   case DRAM_LOG_BUFF_SIZE:
>   return offsetof(SMU74_SoftRegisters, 
> DRAM_LOG_BUFF_SIZE);
>   }
> + break;
>   case SMU_Discrete_DpmTable:
>   switch (member) {
>   case UvdBootLevel:
> -- 
> 2.21.0
> 
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)

2018-06-06 Thread Huang Rui

On Tue, Jun 05, 2018 at 04:44:04PM +0200, Borislav Petkov wrote:
> Hi guys,
> 
> X just froze here ontop of 4.17-rc7+ tip/master (kernel is from last
> week) with the splat at the end.
> 
> Box is a x470 chipset with Ryzen 2700X.
> 
> GPU gets detected as
> 
> [7.440971] [drm] radeon kernel modesetting enabled.
> [7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 
> 0x1043:0x01DA 0x00).
> [7.441328] ATOM BIOS: 9598.10.88.0.3.AS05
> [7.441395] radeon :1d:00.0: VRAM: 512M 0x - 
> 0x1FFF (512M used)
> [7.441464] radeon :1d:00.0: GTT: 512M 0x2000 - 
> 0x3FFF
> [7.441531] [drm] Detected VRAM RAM=512M, BAR=256M
> [7.441588] [drm] RAM width 128bits DDR
> [7.441690] [TTM] Zone  kernel: Available graphics memory: 16462214 kiB
> [7.441751] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
> [7.441811] [TTM] Initializing pool allocator
> [7.441868] [TTM] Initializing DMA pool allocator
> [7.441934] [drm] radeon: 512M of VRAM memory ready
> [7.441990] [drm] radeon: 512M of GTT memory ready.
> [7.442050] [drm] Loading RV635 Microcode
> [7.442865] [drm] Internal thermal controller without fan control
> [7.442940] [drm] radeon: power management initialized
> [7.443222] [drm] GART: num cpu pages 131072, num gpu pages 131072
> [7.443487] [drm] enabling PCIE gen 2 link speeds, disable with 
> radeon.pcie_gen2=0
> [7.477319] [drm] PCIE GART of 512M enabled (table at 0x00142000).
> [7.477400] radeon :1d:00.0: WB enabled
> [7.477455] radeon :1d:00.0: fence driver on ring 0 use gpu addr 
> 0x2c00 and cpu addr 0x(ptrval)
> [7.477708] radeon :1d:00.0: fence driver on ring 5 use gpu addr 
> 0x000521d0 and cpu addr 0x(ptrval)
> [7.48] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [7.477836] [drm] Driver supports precise vblank timestamp query.
> [7.477896] radeon :1d:00.0: radeon: MSI limited to 32-bit
> [7.477990] radeon :1d:00.0: radeon: using MSI.
> [7.478062] [drm] radeon: irq initialized.
> [7.509056] [drm] ring test on 0 succeeded in 0 usecs
> [7.683793] [drm] ring test on 5 succeeded in 1 usecs
> [7.683853] [drm] UVD initialized successfully.
> [7.684009] [drm] ib test on ring 0 succeeded in 0 usecs
> [8.348466] [drm] ib test on ring 5 succeeded
> [8.348921] [drm] Radeon Display Connectors
> [8.348978] [drm] Connector 0:
> [8.349031] [drm]   DVI-I-1
> [8.349082] [drm]   HPD1
> [8.349135] [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 
> 0x7e5c
> [8.349200] [drm]   Encoders:
> [8.349252] [drm] DFP1: INTERNAL_UNIPHY
> [8.349308] [drm] CRT2: INTERNAL_KLDSCP_DAC2
> [8.349364] [drm] Connector 1:
> [8.349416] [drm]   DIN-1
> [8.349467] [drm]   Encoders:
> [8.349520] [drm] TV1: INTERNAL_KLDSCP_DAC2
> [8.349576] [drm] Connector 2:
> [8.349628] [drm]   DVI-I-2
> [8.349680] [drm]   HPD2
> [8.349732] [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 
> 0x7e4c
> [8.349797] [drm]   Encoders:
> [8.349849] [drm] CRT1: INTERNAL_KLDSCP_DAC1
> [8.349905] [drm] DFP2: INTERNAL_KLDSCP_LVTMA
> [8.430521] [drm] fb mappable at 0xE0243000
> [8.430575] [drm] vram apper at 0xE000
> [8.431194] [drm] size 9216000
> [8.431245] [drm] fb depth is 24
> [8.431295] [drm]pitch is 7680
> [8.431406] fbcon: radeondrmfb (fb0) is primary device
> [8.496928] Console: switching to colour frame buffer device 240x75
> [8.501851] radeon :1d:00.0: fb0: radeondrmfb frame buffer device
> [8.520179] [drm] Initialized radeon 2.50.0 20080528 for :1d:00.0 on 
> minor 0
> 
> in the PCIe slot with two monitors connected to it. radeon firmware is
> 
> Version: 20170823-1
> 
> What practically happened is X froze and got restarted after the GPU
> reset. It seems to be ok now, as I'm typing in it.
> 
> Thoughts?
> 
> [197439.022249] Restarting tasks ... done.
> [197439.024043] PM: hibernation exit
> [197439.058296] r8169 :18:00.0 eth0: link up
> [200941.240184] perf: interrupt took too long (2507 > 2500), lowering 
> kernel.perf_event_max_sample_rate to 79750
> [221973.686894] radeon :1d:00.0: ring 0 stalled for more than 10176msec
> [221973.686900] radeon :1d:00.0: GPU lockup (current fence id 
> 0x010da43f last fence id 0x010da52d on ring 0)
> [221973.686929] radeon :1d:00.0: failed to get a new IB (-35)
> [221973.686950] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
> [221973.693971] radeon :1d:00.0: Saved 7609 dwords of commands on ring 0.
> [221973.693985] radeon :1d:00.0: GPU softreset: 0x0008
> [221973.693988] radeon :1d:00.0:   R_008010_GRBM_STATUS  = 0xA0001030
> [221973.693990] radeon :1d:00.0:   R_008014_GRBM_STATUS2 = 0x0003
>

Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)

2018-06-06 Thread Huang Rui

On Tue, Jun 05, 2018 at 04:44:04PM +0200, Borislav Petkov wrote:
> Hi guys,
> 
> X just froze here ontop of 4.17-rc7+ tip/master (kernel is from last
> week) with the splat at the end.
> 
> Box is a x470 chipset with Ryzen 2700X.
> 
> GPU gets detected as
> 
> [7.440971] [drm] radeon kernel modesetting enabled.
> [7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 
> 0x1043:0x01DA 0x00).
> [7.441328] ATOM BIOS: 9598.10.88.0.3.AS05
> [7.441395] radeon :1d:00.0: VRAM: 512M 0x - 
> 0x1FFF (512M used)
> [7.441464] radeon :1d:00.0: GTT: 512M 0x2000 - 
> 0x3FFF
> [7.441531] [drm] Detected VRAM RAM=512M, BAR=256M
> [7.441588] [drm] RAM width 128bits DDR
> [7.441690] [TTM] Zone  kernel: Available graphics memory: 16462214 kiB
> [7.441751] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
> [7.441811] [TTM] Initializing pool allocator
> [7.441868] [TTM] Initializing DMA pool allocator
> [7.441934] [drm] radeon: 512M of VRAM memory ready
> [7.441990] [drm] radeon: 512M of GTT memory ready.
> [7.442050] [drm] Loading RV635 Microcode
> [7.442865] [drm] Internal thermal controller without fan control
> [7.442940] [drm] radeon: power management initialized
> [7.443222] [drm] GART: num cpu pages 131072, num gpu pages 131072
> [7.443487] [drm] enabling PCIE gen 2 link speeds, disable with 
> radeon.pcie_gen2=0
> [7.477319] [drm] PCIE GART of 512M enabled (table at 0x00142000).
> [7.477400] radeon :1d:00.0: WB enabled
> [7.477455] radeon :1d:00.0: fence driver on ring 0 use gpu addr 
> 0x2c00 and cpu addr 0x(ptrval)
> [7.477708] radeon :1d:00.0: fence driver on ring 5 use gpu addr 
> 0x000521d0 and cpu addr 0x(ptrval)
> [7.48] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [7.477836] [drm] Driver supports precise vblank timestamp query.
> [7.477896] radeon :1d:00.0: radeon: MSI limited to 32-bit
> [7.477990] radeon :1d:00.0: radeon: using MSI.
> [7.478062] [drm] radeon: irq initialized.
> [7.509056] [drm] ring test on 0 succeeded in 0 usecs
> [7.683793] [drm] ring test on 5 succeeded in 1 usecs
> [7.683853] [drm] UVD initialized successfully.
> [7.684009] [drm] ib test on ring 0 succeeded in 0 usecs
> [8.348466] [drm] ib test on ring 5 succeeded
> [8.348921] [drm] Radeon Display Connectors
> [8.348978] [drm] Connector 0:
> [8.349031] [drm]   DVI-I-1
> [8.349082] [drm]   HPD1
> [8.349135] [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 
> 0x7e5c
> [8.349200] [drm]   Encoders:
> [8.349252] [drm] DFP1: INTERNAL_UNIPHY
> [8.349308] [drm] CRT2: INTERNAL_KLDSCP_DAC2
> [8.349364] [drm] Connector 1:
> [8.349416] [drm]   DIN-1
> [8.349467] [drm]   Encoders:
> [8.349520] [drm] TV1: INTERNAL_KLDSCP_DAC2
> [8.349576] [drm] Connector 2:
> [8.349628] [drm]   DVI-I-2
> [8.349680] [drm]   HPD2
> [8.349732] [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 
> 0x7e4c
> [8.349797] [drm]   Encoders:
> [8.349849] [drm] CRT1: INTERNAL_KLDSCP_DAC1
> [8.349905] [drm] DFP2: INTERNAL_KLDSCP_LVTMA
> [8.430521] [drm] fb mappable at 0xE0243000
> [8.430575] [drm] vram apper at 0xE000
> [8.431194] [drm] size 9216000
> [8.431245] [drm] fb depth is 24
> [8.431295] [drm]pitch is 7680
> [8.431406] fbcon: radeondrmfb (fb0) is primary device
> [8.496928] Console: switching to colour frame buffer device 240x75
> [8.501851] radeon :1d:00.0: fb0: radeondrmfb frame buffer device
> [8.520179] [drm] Initialized radeon 2.50.0 20080528 for :1d:00.0 on 
> minor 0
> 
> in the PCIe slot with two monitors connected to it. radeon firmware is
> 
> Version: 20170823-1
> 
> What practically happened is X froze and got restarted after the GPU
> reset. It seems to be ok now, as I'm typing in it.
> 
> Thoughts?
> 
> [197439.022249] Restarting tasks ... done.
> [197439.024043] PM: hibernation exit
> [197439.058296] r8169 :18:00.0 eth0: link up
> [200941.240184] perf: interrupt took too long (2507 > 2500), lowering 
> kernel.perf_event_max_sample_rate to 79750
> [221973.686894] radeon :1d:00.0: ring 0 stalled for more than 10176msec
> [221973.686900] radeon :1d:00.0: GPU lockup (current fence id 
> 0x010da43f last fence id 0x010da52d on ring 0)
> [221973.686929] radeon :1d:00.0: failed to get a new IB (-35)
> [221973.686950] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
> [221973.693971] radeon :1d:00.0: Saved 7609 dwords of commands on ring 0.
> [221973.693985] radeon :1d:00.0: GPU softreset: 0x0008
> [221973.693988] radeon :1d:00.0:   R_008010_GRBM_STATUS  = 0xA0001030
> [221973.693990] radeon :1d:00.0:   R_008014_GRBM_STATUS2 = 0x0003
>

Re: [PATCH] drm/radeon: change function signature to pass full range

2018-04-13 Thread Huang Rui

On Thu, Apr 12, 2018 at 09:33:33PM +0200, Mathieu Malaterre wrote:
> In function ‘radeon_process_i2c_ch’ a comparison of a u8 value against
> 255 is done. Since it is always false, change the signature of this
> function to use an `int` instead, which match the type used in caller:
> `radeon_atom_hw_i2c_xfer`.
> 
> Fix the following warning triggered with W=1:
> 
>   CC [M]  drivers/gpu/drm/radeon/atombios_i2c.o
>   drivers/gpu/drm/radeon/atombios_i2c.c: In function ‘radeon_process_i2c_ch’:
>   drivers/gpu/drm/radeon/atombios_i2c.c:71:11: warning: comparison is always 
> false due to limited range of data type [-Wtype-limits]
>if (num > ATOM_MAX_HW_I2C_READ) {
>^
> 
> Signed-off-by: Mathieu Malaterre <ma...@debian.org>

Reviewed-by: Huang Rui <ray.hu...@amd.com>

> ---
>  drivers/gpu/drm/radeon/atombios_i2c.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/radeon/atombios_i2c.c 
> b/drivers/gpu/drm/radeon/atombios_i2c.c
> index 4157780585a0..9022e9af11a0 100644
> --- a/drivers/gpu/drm/radeon/atombios_i2c.c
> +++ b/drivers/gpu/drm/radeon/atombios_i2c.c
> @@ -35,7 +35,7 @@
>  
>  static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan,
>u8 slave_addr, u8 flags,
> -  u8 *buf, u8 num)
> +  u8 *buf, int num)
>  {
>   struct drm_device *dev = chan->dev;
>   struct radeon_device *rdev = dev->dev_private;
> -- 
> 2.11.0
> 
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/radeon: change function signature to pass full range

2018-04-13 Thread Huang Rui

On Thu, Apr 12, 2018 at 09:33:33PM +0200, Mathieu Malaterre wrote:
> In function ‘radeon_process_i2c_ch’ a comparison of a u8 value against
> 255 is done. Since it is always false, change the signature of this
> function to use an `int` instead, which match the type used in caller:
> `radeon_atom_hw_i2c_xfer`.
> 
> Fix the following warning triggered with W=1:
> 
>   CC [M]  drivers/gpu/drm/radeon/atombios_i2c.o
>   drivers/gpu/drm/radeon/atombios_i2c.c: In function ‘radeon_process_i2c_ch’:
>   drivers/gpu/drm/radeon/atombios_i2c.c:71:11: warning: comparison is always 
> false due to limited range of data type [-Wtype-limits]
>if (num > ATOM_MAX_HW_I2C_READ) {
>^
> 
> Signed-off-by: Mathieu Malaterre 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/radeon/atombios_i2c.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/radeon/atombios_i2c.c 
> b/drivers/gpu/drm/radeon/atombios_i2c.c
> index 4157780585a0..9022e9af11a0 100644
> --- a/drivers/gpu/drm/radeon/atombios_i2c.c
> +++ b/drivers/gpu/drm/radeon/atombios_i2c.c
> @@ -35,7 +35,7 @@
>  
>  static int radeon_process_i2c_ch(struct radeon_i2c_chan *chan,
>u8 slave_addr, u8 flags,
> -  u8 *buf, u8 num)
> +  u8 *buf, int num)
>  {
>   struct drm_device *dev = chan->dev;
>   struct radeon_device *rdev = dev->dev_private;
> -- 
> 2.11.0
> 
> ___
> amd-gfx mailing list
> amd-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Huang Rui

On Tue, Apr 10, 2018 at 04:59:55PM -0400, Sinan Kaya wrote:
> Code is expecing to observe the same number of buffers returned from
> dma_map_sg() function compared to sg_alloc_table_from_pages(). This
> doesn't hold true universally especially for systems with IOMMU.
> 
> IOMMU driver tries to combine buffers into a single DMA address as much
> as it can. The right thing is to tell the DMA layer how much combining
> IOMMU can do.
> 
> Signed-off-by: Sinan Kaya 

Sinan, I guess Christian's suggestion is to add amdgpu_device_init function
like here:

8<---
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0e798b3..9b96771 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2339,6 +2339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* init the mode config */
drm_mode_config_init(adev->ddev);

+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
+
r = amdgpu_device_ip_init(adev);
if (r) {
/* failed in exclusive mode due to timeout */
8<---

After that, we don't do it in each generation.

Thanks,
Ray

Re: [PATCH V2] drm/amdgpu: limit DMA size to PAGE_SIZE for scatter-gather buffers

2018-04-11 Thread Huang Rui

On Tue, Apr 10, 2018 at 04:59:55PM -0400, Sinan Kaya wrote:
> Code is expecing to observe the same number of buffers returned from
> dma_map_sg() function compared to sg_alloc_table_from_pages(). This
> doesn't hold true universally especially for systems with IOMMU.
> 
> IOMMU driver tries to combine buffers into a single DMA address as much
> as it can. The right thing is to tell the DMA layer how much combining
> IOMMU can do.
> 
> Signed-off-by: Sinan Kaya 

Sinan, I guess Christian's suggestion is to add amdgpu_device_init function
like here:

8<---
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0e798b3..9b96771 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2339,6 +2339,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* init the mode config */
drm_mode_config_init(adev->ddev);

+   dma_set_max_seg_size(adev->dev, PAGE_SIZE);
+
r = amdgpu_device_ip_init(adev);
if (r) {
/* failed in exclusive mode due to timeout */
8<---

After that, we don't do it in each generation.

Thanks,
Ray

Re: perf/x86/amd/power missing pmu::module initialisation

2017-11-07 Thread Huang Rui

On Tue, Nov 07, 2017 at 08:50:52PM +0800, Huang Rui wrote:
> Hi Mark,
> 
> On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote:
> > Hi,
> > 
> > As a heads-up, I believe that arch/x86/events/amd/power.c is currently
> > missing initialisation of pmu::module, which could result in problems
> > when the module is unloaded.
> > 
> > We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT
> > the AMD power PMU is the only other PMU affected.
> > 
> > Other modular PMUs all initialise this to THIS_MODULE prior to calling
> > perf_pmu_register().
> > 
> 
> Thanks to let me know. I have dived into pmu for quite a while. Could you

Fix typo -> "I haven't dived into ..."

Thanks,
Rui

Re: perf/x86/amd/power missing pmu::module initialisation

2017-11-07 Thread Huang Rui

On Tue, Nov 07, 2017 at 08:50:52PM +0800, Huang Rui wrote:
> Hi Mark,
> 
> On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote:
> > Hi,
> > 
> > As a heads-up, I believe that arch/x86/events/amd/power.c is currently
> > missing initialisation of pmu::module, which could result in problems
> > when the module is unloaded.
> > 
> > We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT
> > the AMD power PMU is the only other PMU affected.
> > 
> > Other modular PMUs all initialise this to THIS_MODULE prior to calling
> > perf_pmu_register().
> > 
> 
> Thanks to let me know. I have dived into pmu for quite a while. Could you

Fix typo -> "I haven't dived into ..."

Thanks,
Rui

Re: perf/x86/amd/power missing pmu::module initialisation

2017-11-07 Thread Huang Rui

Hi Mark,

On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote:
> Hi,
> 
> As a heads-up, I believe that arch/x86/events/amd/power.c is currently
> missing initialisation of pmu::module, which could result in problems
> when the module is unloaded.
> 
> We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT
> the AMD power PMU is the only other PMU affected.
> 
> Other modular PMUs all initialise this to THIS_MODULE prior to calling
> perf_pmu_register().
> 

Thanks to let me know. I have dived into pmu for quite a while. Could you
please share more information or forward related mail thread, so that I can
repro this issue in my side. :-)

Thanks,
Rui

Re: perf/x86/amd/power missing pmu::module initialisation

2017-11-07 Thread Huang Rui

Hi Mark,

On Fri, Nov 03, 2017 at 11:33:56AM +, Mark Rutland wrote:
> Hi,
> 
> As a heads-up, I believe that arch/x86/events/amd/power.c is currently
> missing initialisation of pmu::module, which could result in problems
> when the module is unloaded.
> 
> We've just spotted a similar problem in a couple of ARM PMUs, and AFAICT
> the AMD power PMU is the only other PMU affected.
> 
> Other modular PMUs all initialise this to THIS_MODULE prior to calling
> perf_pmu_register().
> 

Thanks to let me know. I have dived into pmu for quite a while. Could you
please share more information or forward related mail thread, so that I can
repro this issue in my side. :-)

Thanks,
Rui

Re: [PATCH 65/66] hwmon: (fam15h_power) use permission-specific DEVICE_ATTR variants

2016-12-22 Thread Huang Rui

On Thu, Dec 22, 2016 at 01:05:34PM +0100, Julia Lawall wrote:
> Use DEVICE_ATTR_RO etc. for read only attributes etc.  This simplifies the
> source code, improves readbility, and reduces the chance of
> inconsistencies.
> 
> The semantic patch for the RO case, in the case where the show function
> already has the expected name, is as follows:
> (http://coccinelle.lip6.fr/)
> 
> // 
> @ro@
> declarer name DEVICE_ATTR;
> identifier x,x_show;
> @@
> 
> DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL);
> 
> @script:ocaml@
> x << ro.x;
> x_show << ro.x_show;
> @@
> 
> if not (x^"_show" = x_show) then Coccilib.include_match false
> 
> @@
> declarer name DEVICE_ATTR_RO;
> identifier ro.x,ro.x_show;
> @@
> 
> - DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL);
> + DEVICE_ATTR_RO(x);
> // 
> 
> Signed-off-by: Julia Lawall <julia.law...@lip6.fr>
> 

Acked-by: Huang Rui <ray.hu...@amd.com>

Thanks,
Rui

> ---
>  drivers/hwmon/fam15h_power.c |   34 --
>  1 file changed, 16 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
> index 15aa49d..9545a34 100644
> --- a/drivers/hwmon/fam15h_power.c
> +++ b/drivers/hwmon/fam15h_power.c
> @@ -83,8 +83,8 @@ static bool is_carrizo_or_later(void)
>   return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
>  }
>  
> -static ssize_t show_power(struct device *dev,
> -   struct device_attribute *attr, char *buf)
> +static ssize_t power1_input_show(struct device *dev,
> +  struct device_attribute *attr, char *buf)
>  {
>   u32 val, tdp_limit, running_avg_range;
>   s32 running_avg_capture;
> @@ -136,16 +136,16 @@ static ssize_t show_power(struct device *dev,
>   curr_pwr_watts = (curr_pwr_watts * 15625) >> (10 + running_avg_range);
>   return sprintf(buf, "%u\n", (unsigned int) curr_pwr_watts);
>  }
> -static DEVICE_ATTR(power1_input, S_IRUGO, show_power, NULL);
> +static DEVICE_ATTR_RO(power1_input);
>  
> -static ssize_t show_power_crit(struct device *dev,
> -struct device_attribute *attr, char *buf)
> +static ssize_t power1_crit_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>  
>   return sprintf(buf, "%u\n", data->processor_pwr_watts);
>  }
> -static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
> +static DEVICE_ATTR_RO(power1_crit);
>  
>  static void do_read_registers_on_cu(void *_data)
>  {
> @@ -212,9 +212,8 @@ static int read_registers(struct fam15h_power_data *data)
>   return 0;
>  }
>  
> -static ssize_t acc_show_power(struct device *dev,
> -   struct device_attribute *attr,
> -   char *buf)
> +static ssize_t power1_average_show(struct device *dev,
> +struct device_attribute *attr, char *buf)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>   u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
> @@ -267,20 +266,20 @@ static ssize_t acc_show_power(struct device *dev,
>  
>   return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
>  }
> -static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
> +static DEVICE_ATTR_RO(power1_average);
>  
> -static ssize_t acc_show_power_period(struct device *dev,
> -  struct device_attribute *attr,
> -  char *buf)
> +static ssize_t power1_average_interval_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>  
>   return sprintf(buf, "%lu\n", data->power_period);
>  }
>  
> -static ssize_t acc_set_power_period(struct device *dev,
> - struct device_attribute *attr,
> - const char *buf, size_t count)
> +static ssize_t power1_average_interval_store(struct device *dev,
> +  struct device_attribute *attr,
> +  const char *buf, size_t count)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>   unsigned long temp;
> @@ -301,8 +300,7 @@ static ssize_t acc_set_power_period(struct device *dev,
>  
>   return count;
>  }
> -static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR,
> -acc_show_power_period, acc_set_power_period);
> +static DEVICE_ATTR_RW(power1_average_interval);
>  
>  static int fam15h_power_init_attrs(struct pci_dev *pdev,
>  struct fam15h_power_data *data)
>

Re: [PATCH 65/66] hwmon: (fam15h_power) use permission-specific DEVICE_ATTR variants

2016-12-22 Thread Huang Rui

On Thu, Dec 22, 2016 at 01:05:34PM +0100, Julia Lawall wrote:
> Use DEVICE_ATTR_RO etc. for read only attributes etc.  This simplifies the
> source code, improves readbility, and reduces the chance of
> inconsistencies.
> 
> The semantic patch for the RO case, in the case where the show function
> already has the expected name, is as follows:
> (http://coccinelle.lip6.fr/)
> 
> // 
> @ro@
> declarer name DEVICE_ATTR;
> identifier x,x_show;
> @@
> 
> DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL);
> 
> @script:ocaml@
> x << ro.x;
> x_show << ro.x_show;
> @@
> 
> if not (x^"_show" = x_show) then Coccilib.include_match false
> 
> @@
> declarer name DEVICE_ATTR_RO;
> identifier ro.x,ro.x_show;
> @@
> 
> - DEVICE_ATTR(x, \(0444\|S_IRUGO\), x_show, NULL);
> + DEVICE_ATTR_RO(x);
> // 
> 
> Signed-off-by: Julia Lawall 
> 

Acked-by: Huang Rui 

Thanks,
Rui

> ---
>  drivers/hwmon/fam15h_power.c |   34 --
>  1 file changed, 16 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
> index 15aa49d..9545a34 100644
> --- a/drivers/hwmon/fam15h_power.c
> +++ b/drivers/hwmon/fam15h_power.c
> @@ -83,8 +83,8 @@ static bool is_carrizo_or_later(void)
>   return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
>  }
>  
> -static ssize_t show_power(struct device *dev,
> -   struct device_attribute *attr, char *buf)
> +static ssize_t power1_input_show(struct device *dev,
> +  struct device_attribute *attr, char *buf)
>  {
>   u32 val, tdp_limit, running_avg_range;
>   s32 running_avg_capture;
> @@ -136,16 +136,16 @@ static ssize_t show_power(struct device *dev,
>   curr_pwr_watts = (curr_pwr_watts * 15625) >> (10 + running_avg_range);
>   return sprintf(buf, "%u\n", (unsigned int) curr_pwr_watts);
>  }
> -static DEVICE_ATTR(power1_input, S_IRUGO, show_power, NULL);
> +static DEVICE_ATTR_RO(power1_input);
>  
> -static ssize_t show_power_crit(struct device *dev,
> -struct device_attribute *attr, char *buf)
> +static ssize_t power1_crit_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>  
>   return sprintf(buf, "%u\n", data->processor_pwr_watts);
>  }
> -static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
> +static DEVICE_ATTR_RO(power1_crit);
>  
>  static void do_read_registers_on_cu(void *_data)
>  {
> @@ -212,9 +212,8 @@ static int read_registers(struct fam15h_power_data *data)
>   return 0;
>  }
>  
> -static ssize_t acc_show_power(struct device *dev,
> -   struct device_attribute *attr,
> -   char *buf)
> +static ssize_t power1_average_show(struct device *dev,
> +struct device_attribute *attr, char *buf)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>   u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
> @@ -267,20 +266,20 @@ static ssize_t acc_show_power(struct device *dev,
>  
>   return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
>  }
> -static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
> +static DEVICE_ATTR_RO(power1_average);
>  
> -static ssize_t acc_show_power_period(struct device *dev,
> -  struct device_attribute *attr,
> -  char *buf)
> +static ssize_t power1_average_interval_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>  
>   return sprintf(buf, "%lu\n", data->power_period);
>  }
>  
> -static ssize_t acc_set_power_period(struct device *dev,
> - struct device_attribute *attr,
> - const char *buf, size_t count)
> +static ssize_t power1_average_interval_store(struct device *dev,
> +  struct device_attribute *attr,
> +  const char *buf, size_t count)
>  {
>   struct fam15h_power_data *data = dev_get_drvdata(dev);
>   unsigned long temp;
> @@ -301,8 +300,7 @@ static ssize_t acc_set_power_period(struct device *dev,
>  
>   return count;
>  }
> -static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR,
> -acc_show_power_period, acc_set_power_period);
> +static DEVICE_ATTR_RW(power1_average_interval);
>  
>  static int fam15h_power_init_attrs(struct pci_dev *pdev,
>  struct fam15h_power_data *data)
>

Re: [PATCH] drm/amd/powerplay/smumgr: mark symbols static where possible

2016-09-19 Thread Huang Rui

Hi Baoyou,

On Mon, Sep 19, 2016 at 03:28:39PM +0800, Baoyou Xie wrote:
> We get a few warnings when building kernel with W=1:
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/tonga_smumgr.c:146:5: warning: 
> no previous prototype for 'tonga_program_jump_on_start' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/fiji_smumgr.c:816:5: warning: 
> no previous prototype for 'fiji_save_vft_table' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/polaris10_smumgr.c:743:5: 
> warning: no previous prototype for 'polaris10_avfs_event_mgr' 
> [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/iceland_smumgr.c:167:5: 
> warning: no previous prototype for 'iceland_program_jump_on_start' 
> [-Wmissing-prototypes]
> 
> 
> In fact, these functions are only used in the file in which they are
> declared and don't need a declaration, but can be made static.
> So this patch marks these functions with 'static'.
> 

+ amd-gfx mail list.

Thanks to your patch. But powerplay part is refactored recently below, some
functions are already updated. So you might need rebase them. :-)

https://lists.freedesktop.org/archives/amd-gfx/2016-September/002118.html

Thanks,
Rui

Re: [PATCH] drm/amd/powerplay/smumgr: mark symbols static where possible

2016-09-19 Thread Huang Rui

Hi Baoyou,

On Mon, Sep 19, 2016 at 03:28:39PM +0800, Baoyou Xie wrote:
> We get a few warnings when building kernel with W=1:
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/tonga_smumgr.c:146:5: warning: 
> no previous prototype for 'tonga_program_jump_on_start' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/fiji_smumgr.c:816:5: warning: 
> no previous prototype for 'fiji_save_vft_table' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/polaris10_smumgr.c:743:5: 
> warning: no previous prototype for 'polaris10_avfs_event_mgr' 
> [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/../powerplay/smumgr/iceland_smumgr.c:167:5: 
> warning: no previous prototype for 'iceland_program_jump_on_start' 
> [-Wmissing-prototypes]
> 
> 
> In fact, these functions are only used in the file in which they are
> declared and don't need a declaration, but can be made static.
> So this patch marks these functions with 'static'.
> 

+ amd-gfx mail list.

Thanks to your patch. But powerplay part is refactored recently below, some
functions are already updated. So you might need rebase them. :-)

https://lists.freedesktop.org/archives/amd-gfx/2016-September/002118.html

Thanks,
Rui

Re: [PATCH] drm: amdgpu: mark symbols static where possible

2016-09-03 Thread Huang Rui

On Sat, Sep 03, 2016 at 01:57:14PM +0800, Baoyou Xie wrote:
> We get a few warnings when building kernel with W=1:
> drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for 
> 'cz_send_msg_to_smc_async' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for 
> 'cz_write_smc_sram_dword' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous 
> prototype for 'iceland_start_smc' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype 
> for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype 
> for 'amdgpu_job_free_cb' [-Wmissing-prototypes]
> 
> 
> In fact, these functions are only used in the file in which they are
> declared and don't need a declaration, but can be made static.
> So this patch marks these functions with 'static'.
> 
> Signed-off-by: Baoyou Xie <baoyou@linaro.org>

Thanks.

Acked-by: Huang Rui <ray.hu...@amd.com>

Re: [PATCH] drm: amdgpu: mark symbols static where possible

2016-09-03 Thread Huang Rui

On Sat, Sep 03, 2016 at 01:57:14PM +0800, Baoyou Xie wrote:
> We get a few warnings when building kernel with W=1:
> drivers/gpu/drm/amd/amdgpu/cz_smc.c:51:5: warning: no previous prototype for 
> 'cz_send_msg_to_smc_async' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/cz_smc.c:143:5: warning: no previous prototype for 
> 'cz_write_smc_sram_dword' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/iceland_smc.c:124:6: warning: no previous 
> prototype for 'iceland_start_smc' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3926:6: warning: no previous prototype 
> for 'gfx_v8_0_rlc_stop' [-Wmissing-prototypes]
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:94:6: warning: no previous prototype 
> for 'amdgpu_job_free_cb' [-Wmissing-prototypes]
> 
> 
> In fact, these functions are only used in the file in which they are
> declared and don't need a declaration, but can be made static.
> So this patch marks these functions with 'static'.
> 
> Signed-off-by: Baoyou Xie 

Thanks.

Acked-by: Huang Rui

Re: perf: fuzzer crashes immediately on AMD system

2016-08-22 Thread Huang Rui

Hi Peter, Vince

On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > 
> > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > falls over more or less immediately.
> > > 
> > > This maps to variable_test_bit()
> > >   called by ctx = find_get_context(pmu, task, event);
> > >   in kernel/events/core.c:9467
> > > 
> > > It happens quickly enough I can probably track down the exact event that 
> > > causes this, if needed.
> > 
> > I have a one line reproducer:
> > 
> > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> 
> OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> various manuals to see if I can spot the fail.
> 
> Huang could you either prod someone at AMD or do yourself, audit the AMD
> perf code for all the various new models?

Actually, there might be some NBPMC event changes between model 0h-fh and
model 10h-1fh. Below are the documents of these two processors:

http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
http://support.amd.com/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf

In section 3.16, it describes usage of NB Performance Counter Events.

Hope it helps. :-)

Thanks,
Rui

Re: perf: fuzzer crashes immediately on AMD system

2016-08-22 Thread Huang Rui

Hi Peter, Vince

On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > 
> > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > falls over more or less immediately.
> > > 
> > > This maps to variable_test_bit()
> > >   called by ctx = find_get_context(pmu, task, event);
> > >   in kernel/events/core.c:9467
> > > 
> > > It happens quickly enough I can probably track down the exact event that 
> > > causes this, if needed.
> > 
> > I have a one line reproducer:
> > 
> > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> 
> OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> various manuals to see if I can spot the fail.
> 
> Huang could you either prod someone at AMD or do yourself, audit the AMD
> perf code for all the various new models?

Actually, there might be some NBPMC event changes between model 0h-fh and
model 10h-1fh. Below are the documents of these two processors:

http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
http://support.amd.com/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf

In section 3.16, it describes usage of NB Performance Counter Events.

Hope it helps. :-)

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-07-19 Thread Huang Rui

On Tue, Jul 19, 2016 at 02:22:36PM -0400, Vince Weaver wrote:
> On Fri, 17 Jun 2016, Huang Rui wrote:
> 
> > On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote:
> > > On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote:
> > > > I was told this feature would be supported on fam15h 60h, 70h and
> > > > later processors before. Just checked the fam16h model 30h BKDG, yes,
> > > > it should be also supported. But I didn't test that platform, if you
> > > > confirm it works in your side. We can enable it.
> > > 
> > > You might want to ask around first whether F16M30's acc power machinery
> > > is even usable? I.e., no errata and whatnot...
> > > 
> > 
> > Yep, I already asked the designers, and was waiting for their
> > feedbacks.
> 
> Was there ever any feedback about any of the problems encountered with AMD 
> APM support?
> 

Vince, apology to late response. We are drafting the erratum for this
feature. Will let you know if it is public.

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-07-19 Thread Huang Rui

On Tue, Jul 19, 2016 at 02:22:36PM -0400, Vince Weaver wrote:
> On Fri, 17 Jun 2016, Huang Rui wrote:
> 
> > On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote:
> > > On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote:
> > > > I was told this feature would be supported on fam15h 60h, 70h and
> > > > later processors before. Just checked the fam16h model 30h BKDG, yes,
> > > > it should be also supported. But I didn't test that platform, if you
> > > > confirm it works in your side. We can enable it.
> > > 
> > > You might want to ask around first whether F16M30's acc power machinery
> > > is even usable? I.e., no errata and whatnot...
> > > 
> > 
> > Yep, I already asked the designers, and was waiting for their
> > feedbacks.
> 
> Was there ever any feedback about any of the problems encountered with AMD 
> APM support?
> 

Vince, apology to late response. We are drafting the erratum for this
feature. Will let you know if it is public.

Thanks,
Rui

Re: [patch] events/amd/power add support for fam16h model30h

2016-06-17 Thread Huang Rui

On Thu, Jun 16, 2016 at 11:12:18PM +0200, Borislav Petkov wrote:
> On Thu, Jun 16, 2016 at 05:00:04PM -0400, Vince Weaver wrote:
> > 
> > According to the BKDG the AMD Family16h Model30h "Jaguar Mullins"
> > also supports the accumulated power interface.  I've tested on
> > hardware I have and with this patch I indeed get power readings using 
> > perf.
> > 
> > Signed-off-by: Vince Weaver 
> > 
> > diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
> > index 55a3529..778b77d 100644
> > --- a/arch/x86/events/amd/power.c
> > +++ b/arch/x86/events/amd/power.c
> > @@ -292,6 +292,7 @@ static struct notifier_block power_cpu_notifier_nb = {
> >  
> >  static const struct x86_cpu_id cpu_match[] = {
> > { .vendor = X86_VENDOR_AMD, .family = 0x15 },
> > +   { .vendor = X86_VENDOR_AMD, .family = 0x16 },
> > {},
> 
> Actually, I think we remove that table completely and rely solely on the
> CPUID bit:
> 
> if (!boot_cpu_has(X86_FEATURE_ACC_POWER))
> return -ENODEV;
> 
> Rui?
> 

Agree with you. If the some chips are not stable, we can add a check
to ignore them with family and model id.

Thanks,
Rui

Re: [patch] events/amd/power add support for fam16h model30h

2016-06-17 Thread Huang Rui

On Thu, Jun 16, 2016 at 11:12:18PM +0200, Borislav Petkov wrote:
> On Thu, Jun 16, 2016 at 05:00:04PM -0400, Vince Weaver wrote:
> > 
> > According to the BKDG the AMD Family16h Model30h "Jaguar Mullins"
> > also supports the accumulated power interface.  I've tested on
> > hardware I have and with this patch I indeed get power readings using 
> > perf.
> > 
> > Signed-off-by: Vince Weaver 
> > 
> > diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
> > index 55a3529..778b77d 100644
> > --- a/arch/x86/events/amd/power.c
> > +++ b/arch/x86/events/amd/power.c
> > @@ -292,6 +292,7 @@ static struct notifier_block power_cpu_notifier_nb = {
> >  
> >  static const struct x86_cpu_id cpu_match[] = {
> > { .vendor = X86_VENDOR_AMD, .family = 0x15 },
> > +   { .vendor = X86_VENDOR_AMD, .family = 0x16 },
> > {},
> 
> Actually, I think we remove that table completely and rely solely on the
> CPUID bit:
> 
> if (!boot_cpu_has(X86_FEATURE_ACC_POWER))
> return -ENODEV;
> 
> Rui?
> 

Agree with you. If the some chips are not stable, we can add a check
to ignore them with family and model id.

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-17 Thread Huang Rui

On Thu, Jun 16, 2016 at 04:44:20PM -0400, Vince Weaver wrote:
> On Thu, 16 Jun 2016, Huang Rui wrote:
> 
> > > 1.  In theory this should also work on an amd fam16h model 30h
> > > processor too, correct?  The current code limits things to fam15h
> > > even though the fam16mod30h has all the proper cpuid flags.
> > > 
> > 
> > I was told this feature would be supported on fam15h 60h, 70h and
> > later processors before. Just checked the fam16h model 30h BKDG, yes,
> > it should be also supported. But I didn't test that platform, if you
> > confirm it works in your side. We can enable it.
> 
> I can confirm I get power readings on my fam16hmod30h board once I apply a 
> trivial patch to the driver.  I'll send the patch in a separate e-mail.
> 

OK, thanks.

> > PTSC's frequency is about 100Mhz, it shouldn't be overflow.
> 
> That's what I thought.  I'm trying to read the value using the /dev/msr 
> interface from userspace and I get weird results.
> 
> i.e.:
>   Jx: read 62d299b84
>   PTSC MSR: read 72fe92
>   
>   sleep 5ms
> 
>   Jy: read 631b453b9
>   PTSC MSR: read 46b25
> 
> this happens about half the time (PTSC going backwards).  Though 
> admittedly the problem could somehow be in the MSR code I'm using.
> 

Can you try to read the MSR value two times with the same core
(rdmsrl_on_cpu)?

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-17 Thread Huang Rui

On Thu, Jun 16, 2016 at 04:44:20PM -0400, Vince Weaver wrote:
> On Thu, 16 Jun 2016, Huang Rui wrote:
> 
> > > 1.  In theory this should also work on an amd fam16h model 30h
> > > processor too, correct?  The current code limits things to fam15h
> > > even though the fam16mod30h has all the proper cpuid flags.
> > > 
> > 
> > I was told this feature would be supported on fam15h 60h, 70h and
> > later processors before. Just checked the fam16h model 30h BKDG, yes,
> > it should be also supported. But I didn't test that platform, if you
> > confirm it works in your side. We can enable it.
> 
> I can confirm I get power readings on my fam16hmod30h board once I apply a 
> trivial patch to the driver.  I'll send the patch in a separate e-mail.
> 

OK, thanks.

> > PTSC's frequency is about 100Mhz, it shouldn't be overflow.
> 
> That's what I thought.  I'm trying to read the value using the /dev/msr 
> interface from userspace and I get weird results.
> 
> i.e.:
>   Jx: read 62d299b84
>   PTSC MSR: read 72fe92
>   
>   sleep 5ms
> 
>   Jy: read 631b453b9
>   PTSC MSR: read 46b25
> 
> this happens about half the time (PTSC going backwards).  Though 
> admittedly the problem could somehow be in the MSR code I'm using.
> 

Can you try to read the MSR value two times with the same core
(rdmsrl_on_cpu)?

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-17 Thread Huang Rui

On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote:
> On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote:
> > I was told this feature would be supported on fam15h 60h, 70h and
> > later processors before. Just checked the fam16h model 30h BKDG, yes,
> > it should be also supported. But I didn't test that platform, if you
> > confirm it works in your side. We can enable it.
> 
> You might want to ask around first whether F16M30's acc power machinery
> is even usable? I.e., no errata and whatnot...
> 

Yep, I already asked the designers, and was waiting for their
feedbacks.

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-17 Thread Huang Rui

On Thu, Jun 16, 2016 at 06:47:00PM +0200, Borislav Petkov wrote:
> On Thu, Jun 16, 2016 at 01:38:14PM +0800, Huang Rui wrote:
> > I was told this feature would be supported on fam15h 60h, 70h and
> > later processors before. Just checked the fam16h model 30h BKDG, yes,
> > it should be also supported. But I didn't test that platform, if you
> > confirm it works in your side. We can enable it.
> 
> You might want to ask around first whether F16M30's acc power machinery
> is even usable? I.e., no errata and whatnot...
> 

Yep, I already asked the designers, and was waiting for their
feedbacks.

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-16 Thread Huang Rui

Hi Vince,

Thanks for asking.

On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote:
> 
> three questions about this functionality:
> 
> 1.  In theory this should also work on an amd fam16h model 30h
> processor too, correct?  The current code limits things to fam15h
> even though the fam16mod30h has all the proper cpuid flags.
> 

I was told this feature would be supported on fam15h 60h, 70h and
later processors before. Just checked the fam16h model 30h BKDG, yes,
it should be also supported. But I didn't test that platform, if you
confirm it works in your side. We can enable it.

> I've tested the functionality a bit and it seems to work but for
> some reason the ptsc seems to occasionally count backwards
> on my machine.  Any reason that would be?  (It doesn't seem to be
> an overflow, just reading the ptsc 5ms apart and the values are 
> slightly lower after than before).
> 

PTSC's frequency is about 100Mhz, it shouldn't be overflow.

> 2.  Unless I'm misunderstanding things, the code seems to be accumulating 
>   Power. (see chunk below) Power is an instantaneous measurement, it 
>   makes no sense to add values.  If you use 5W for 1ms and 10W for
>   1ms, the average power across the 2ms interval is not 15W.
> 
>   You can add energy, but not power.
> 
> > +   delta *= cpu_pwr_sample_ratio * 1000;
> > +   tdelta = new_ptsc - prev_ptsc;
> > +
> > +   do_div(delta, tdelta);
> > +   local64_add(delta, >count);
> 

You're right. Nice catch! The average power is per compute unit. We
cannot add the power simplely for each processor/package.

So here, the average power per package should be (delta1 + delta2 + ... + 
deltaN)/(tdelta_avg).
I will work out a fix. Thanks to point out.

> 3.  The actual results gathered seem rediculously low.  341 seconds of
> calculation and only using 183 mWatts of power?
> 

mWatts are for processor power not system power. Below data is
calculated on fam15h model 60h which is low power platform. Even
though the method has a minor mistake, the processor power should be
in mWatts field.

> >Performance counter stats for 'system wide':
> > 
> >   183.44 mWatts power/power-pkg/
> > 
> >341.837270111 seconds time elapsed
> > 
> >   root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' 
> > sleep 10
> 

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-16 Thread Huang Rui

Hi Vince,

Thanks for asking.

On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote:
> 
> three questions about this functionality:
> 
> 1.  In theory this should also work on an amd fam16h model 30h
> processor too, correct?  The current code limits things to fam15h
> even though the fam16mod30h has all the proper cpuid flags.
> 

I was told this feature would be supported on fam15h 60h, 70h and
later processors before. Just checked the fam16h model 30h BKDG, yes,
it should be also supported. But I didn't test that platform, if you
confirm it works in your side. We can enable it.

> I've tested the functionality a bit and it seems to work but for
> some reason the ptsc seems to occasionally count backwards
> on my machine.  Any reason that would be?  (It doesn't seem to be
> an overflow, just reading the ptsc 5ms apart and the values are 
> slightly lower after than before).
> 

PTSC's frequency is about 100Mhz, it shouldn't be overflow.

> 2.  Unless I'm misunderstanding things, the code seems to be accumulating 
>   Power. (see chunk below) Power is an instantaneous measurement, it 
>   makes no sense to add values.  If you use 5W for 1ms and 10W for
>   1ms, the average power across the 2ms interval is not 15W.
> 
>   You can add energy, but not power.
> 
> > +   delta *= cpu_pwr_sample_ratio * 1000;
> > +   tdelta = new_ptsc - prev_ptsc;
> > +
> > +   do_div(delta, tdelta);
> > +   local64_add(delta, >count);
> 

You're right. Nice catch! The average power is per compute unit. We
cannot add the power simplely for each processor/package.

So here, the average power per package should be (delta1 + delta2 + ... + 
deltaN)/(tdelta_avg).
I will work out a fix. Thanks to point out.

> 3.  The actual results gathered seem rediculously low.  341 seconds of
> calculation and only using 183 mWatts of power?
> 

mWatts are for processor power not system power. Below data is
calculated on fam15h model 60h which is low power platform. Even
though the method has a minor mistake, the processor power should be
in mWatts field.

> >Performance counter stats for 'system wide':
> > 
> >   183.44 mWatts power/power-pkg/
> > 
> >341.837270111 seconds time elapsed
> > 
> >   root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' 
> > sleep 10
> 

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-15 Thread Huang Rui

On Thu, Jun 16, 2016 at 01:38:13PM +0800, Huang Rui wrote:
> On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote:
> > 
> > 2.  Unless I'm misunderstanding things, the code seems to be accumulating 
> > Power. (see chunk below) Power is an instantaneous measurement, it 
> > makes no sense to add values.  If you use 5W for 1ms and 10W for
> > 1ms, the average power across the 2ms interval is not 15W.
> > 
> > You can add energy, but not power.
> > 
> > > + delta *= cpu_pwr_sample_ratio * 1000;
> > > + tdelta = new_ptsc - prev_ptsc;
> > > +
> > > + do_div(delta, tdelta);
> > > + local64_add(delta, >count);
> > 
> 
> You're right. Nice catch! The average power is per compute unit. We
> cannot add the power simplely for each processor/package.
> 
> So here, the average power per package should be (delta1 + delta2 + ... + 
> deltaN)/(tdelta_avg).
> I will work out a fix. Thanks to point out.
> 

After considering carefully, the original method should be OK. 

  AMD nomenclature for CMT systems:

[node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
 -> [Compute Unit Core 1] -> Linux CPU 1
 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
 -> [Compute Unit Core 1] -> Linux CPU 3

The deltaN is power per compute unit. Current one package has two CUs.
In the *same* interval, CU0's power is 10W, CU1's power is 15W. The
package (CU0 + CU1) power should be 15W, right? Because the interval
is the same.

Q = Q1 + Q2.  P = Q/t = (Q1 + Q2)/t = Q1/t + Q2/t = P1 + P2.

Is that clear?

Thanks,
Rui

Re: [REDO PATCH v7] perf/x86/amd/power: Add AMD accumulated power reporting mechanism

2016-06-15 Thread Huang Rui

On Thu, Jun 16, 2016 at 01:38:13PM +0800, Huang Rui wrote:
> On Wed, Jun 15, 2016 at 09:13:59PM -0400, Vince Weaver wrote:
> > 
> > 2.  Unless I'm misunderstanding things, the code seems to be accumulating 
> > Power. (see chunk below) Power is an instantaneous measurement, it 
> > makes no sense to add values.  If you use 5W for 1ms and 10W for
> > 1ms, the average power across the 2ms interval is not 15W.
> > 
> > You can add energy, but not power.
> > 
> > > + delta *= cpu_pwr_sample_ratio * 1000;
> > > + tdelta = new_ptsc - prev_ptsc;
> > > +
> > > + do_div(delta, tdelta);
> > > + local64_add(delta, >count);
> > 
> 
> You're right. Nice catch! The average power is per compute unit. We
> cannot add the power simplely for each processor/package.
> 
> So here, the average power per package should be (delta1 + delta2 + ... + 
> deltaN)/(tdelta_avg).
> I will work out a fix. Thanks to point out.
> 

After considering carefully, the original method should be OK. 

  AMD nomenclature for CMT systems:

[node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
 -> [Compute Unit Core 1] -> Linux CPU 1
 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
 -> [Compute Unit Core 1] -> Linux CPU 3

The deltaN is power per compute unit. Current one package has two CUs.
In the *same* interval, CU0's power is 10W, CU1's power is 15W. The
package (CU0 + CU1) power should be 15W, right? Because the interval
is the same.

Q = Q1 + Q2.  P = Q/t = (Q1 + Q2)/t = Q1/t + Q2/t = P1 + P2.

Is that clear?

Thanks,
Rui

Re: [PATCH] hwmon: (fam15h_power) Disable preemption when reading registers

2016-06-06 Thread Huang Rui

On Sat, Jun 04, 2016 at 07:55:08AM -0700, Guenter Roeck wrote:
> On 06/03/2016 11:29 AM, Borislav Petkov wrote:
> >On Fri, Jun 03, 2016 at 11:18:36AM -0700, Guenter Roeck wrote:
> >>I like this version. Applied, though it would be nice to get a
> >>Tested-by: or Acked-by: from someone at AMD. I'll hold back until
> >>early next week before sending it to Linus.
> >
> >If it makes you feel any better: I have the affected hardware and am
> >testing on it too. I wouldnt've caught it otherwise.
> >
> 
> I'll send it to Linus next week unless I get an objection.
> 

Guenter, I already verified on AMD CZ(family 15h, model 60h) and model
70h processors.

Acked-and-tested-by: Huang Rui <ray.hu...@amd.com>

Re: [PATCH] hwmon: (fam15h_power) Disable preemption when reading registers

2016-06-06 Thread Huang Rui

On Sat, Jun 04, 2016 at 07:55:08AM -0700, Guenter Roeck wrote:
> On 06/03/2016 11:29 AM, Borislav Petkov wrote:
> >On Fri, Jun 03, 2016 at 11:18:36AM -0700, Guenter Roeck wrote:
> >>I like this version. Applied, though it would be nice to get a
> >>Tested-by: or Acked-by: from someone at AMD. I'll hold back until
> >>early next week before sending it to Linus.
> >
> >If it makes you feel any better: I have the affected hardware and am
> >testing on it too. I wouldnt've caught it otherwise.
> >
> 
> I'll send it to Linus next week unless I get an objection.
> 

Guenter, I already verified on AMD CZ(family 15h, model 60h) and model
70h processors.

Acked-and-tested-by: Huang Rui

Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-04-06 Thread Huang Rui

On Wed, Apr 06, 2016 at 08:30:25AM -0700, Guenter Roeck wrote:
> On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote:
> >  
> > +static void do_read_registers_on_cu(void *_data)
> > +{
> > +   struct fam15h_power_data *data = _data;
> > +   int cpu, cu;
> > +
> > +   cpu = smp_processor_id();
> > +
> 
> Is this function now defined in non-SMP code ? If so, can you point me to the
> patch or branch introducing it ? It doesn't seem to be in mainline or in -next
> unless I am missing it.
> 

In include/linux/smp.h

#else /* !SMP */

static inline void smp_send_stop(void) { }

/*
 *  These macros fold the SMP functionality into a single CPU system
 */
#define raw_smp_processor_id()  0

...

/*
 * smp_processor_id(): get the current CPU ID.
 *
 * if DEBUG_PREEMPT is enabled then we check whether it is
 * used in a preemption-safe way. (smp_processor_id() is safe
 * if it's used in a preemption-off critical section, or in
 * a thread that is bound to the current CPU.)
 *
 * NOTE: raw_smp_processor_id() is for internal use only
 * (smp_processor_id() is the preferred variant), but in rare
 * instances it might also be used to turn off false positives
 * (i.e. smp_processor_id() use that the debugging code reports but
 * which use for some reason is legal). Don't use this to hack around
 * the warning message, as your code might not work under PREEMPT.
 */
#ifdef CONFIG_DEBUG_PREEMPT
  extern unsigned int debug_smp_processor_id(void);
# define smp_processor_id() debug_smp_processor_id()
#else
# define smp_processor_id() raw_smp_processor_id()
#endif

Actually smp_processor_id() should returns 0 if we disable CONFIG_SMP.

> > +   /*
> > +* With the new x86 topology modelling, cpu core id actually
> > +* is compute unit id.
> > +*/
> > +   cu = cpu_data(cpu).cpu_core_id;
> > +
> > +   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
> > +}
> > +
> > +/*
> > + * This function is only able to be called when CPUID
> > + * Fn8000_0007:EDX[12] is set.
> > + */
> > +static int read_registers(struct fam15h_power_data *data)
> > +{
> > +   int this_cpu, ret, cpu;
> > +   int core, this_core;
> > +   cpumask_var_t mask;
> > +
> > +   ret = zalloc_cpumask_var(, GFP_KERNEL);
> > +   if (!ret)
> > +   return -ENOMEM;
> > +
> > +   get_online_cpus();
> > +   this_cpu = smp_processor_id();
> > +
> > +   /*
> > +* Choose the first online core of each compute unit, and then
> > +* read their MSR value of power and ptsc in a single IPI,
> > +* because the MSR value of CPU core represent the compute
> > +* unit's.
> > +*/
> > +   core = -1;
> > +
> > +   for_each_online_cpu(cpu) {
> > +   this_core = topology_core_id(cpu);
> > +
> > +   if (this_core == core)
> > +   continue;
> > +
> > +   core = this_core;
> > +
> Sorry if I missed some context - is it guaranteed that all cores in the same
> compute unit are returned next to each other from for_each_online_cpu() ?
> 

Yes, there is a documentation which introduced from v4.6-rc2:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f7be8610bca88e59dd2fd5d98fcbc5031ef0e079

   - topology_core_id();

The ID of the core to which a thread belongs. It is also printed in 
/proc/cpuinfo
"core_id."

...

  AMD nomenclature for CMT systems:

[node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
 -> [Compute Unit Core 1] -> Linux CPU 1
 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
 -> [Compute Unit Core 1] -> Linux CPU 3

ray@hr-ub:~/tip$ cat /proc/cpuinfo | grep "core id"
core id : 0
core id : 0
core id : 1
core id : 1

"this_core" here actually means the [Compute Unit] id which current
[Compute Unit Core] belongs to. And "cpu" here means the [Compute Unit Core].

Thanks,
Rui

Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-04-06 Thread Huang Rui

On Wed, Apr 06, 2016 at 08:30:25AM -0700, Guenter Roeck wrote:
> On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote:
> >  
> > +static void do_read_registers_on_cu(void *_data)
> > +{
> > +   struct fam15h_power_data *data = _data;
> > +   int cpu, cu;
> > +
> > +   cpu = smp_processor_id();
> > +
> 
> Is this function now defined in non-SMP code ? If so, can you point me to the
> patch or branch introducing it ? It doesn't seem to be in mainline or in -next
> unless I am missing it.
> 

In include/linux/smp.h

#else /* !SMP */

static inline void smp_send_stop(void) { }

/*
 *  These macros fold the SMP functionality into a single CPU system
 */
#define raw_smp_processor_id()  0

...

/*
 * smp_processor_id(): get the current CPU ID.
 *
 * if DEBUG_PREEMPT is enabled then we check whether it is
 * used in a preemption-safe way. (smp_processor_id() is safe
 * if it's used in a preemption-off critical section, or in
 * a thread that is bound to the current CPU.)
 *
 * NOTE: raw_smp_processor_id() is for internal use only
 * (smp_processor_id() is the preferred variant), but in rare
 * instances it might also be used to turn off false positives
 * (i.e. smp_processor_id() use that the debugging code reports but
 * which use for some reason is legal). Don't use this to hack around
 * the warning message, as your code might not work under PREEMPT.
 */
#ifdef CONFIG_DEBUG_PREEMPT
  extern unsigned int debug_smp_processor_id(void);
# define smp_processor_id() debug_smp_processor_id()
#else
# define smp_processor_id() raw_smp_processor_id()
#endif

Actually smp_processor_id() should returns 0 if we disable CONFIG_SMP.

> > +   /*
> > +* With the new x86 topology modelling, cpu core id actually
> > +* is compute unit id.
> > +*/
> > +   cu = cpu_data(cpu).cpu_core_id;
> > +
> > +   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
> > +}
> > +
> > +/*
> > + * This function is only able to be called when CPUID
> > + * Fn8000_0007:EDX[12] is set.
> > + */
> > +static int read_registers(struct fam15h_power_data *data)
> > +{
> > +   int this_cpu, ret, cpu;
> > +   int core, this_core;
> > +   cpumask_var_t mask;
> > +
> > +   ret = zalloc_cpumask_var(, GFP_KERNEL);
> > +   if (!ret)
> > +   return -ENOMEM;
> > +
> > +   get_online_cpus();
> > +   this_cpu = smp_processor_id();
> > +
> > +   /*
> > +* Choose the first online core of each compute unit, and then
> > +* read their MSR value of power and ptsc in a single IPI,
> > +* because the MSR value of CPU core represent the compute
> > +* unit's.
> > +*/
> > +   core = -1;
> > +
> > +   for_each_online_cpu(cpu) {
> > +   this_core = topology_core_id(cpu);
> > +
> > +   if (this_core == core)
> > +   continue;
> > +
> > +   core = this_core;
> > +
> Sorry if I missed some context - is it guaranteed that all cores in the same
> compute unit are returned next to each other from for_each_online_cpu() ?
> 

Yes, there is a documentation which introduced from v4.6-rc2:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f7be8610bca88e59dd2fd5d98fcbc5031ef0e079

   - topology_core_id();

The ID of the core to which a thread belongs. It is also printed in 
/proc/cpuinfo
"core_id."

...

  AMD nomenclature for CMT systems:

[node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
 -> [Compute Unit Core 1] -> Linux CPU 1
 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
 -> [Compute Unit Core 1] -> Linux CPU 3

ray@hr-ub:~/tip$ cat /proc/cpuinfo | grep "core id"
core id : 0
core id : 0
core id : 1
core id : 1

"this_core" here actually means the [Compute Unit] id which current
[Compute Unit Core] belongs to. And "cpu" here means the [Compute Unit Core].

Thanks,
Rui

[PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power

2016-04-06 Thread Huang Rui

PTSC is the performance timestamp counter value in a cpu core and the
cores in one compute unit have the fixed frequency. So it picks up the
performance timestamp counter value of the first core per compute unit
to measure the interval for average power per compute unit.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
---
 drivers/hwmon/fam15h_power.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4edbaf0..336d422 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -50,6 +50,7 @@ MODULE_LICENSE("GPL");
 
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
+#define MSR_F15H_PTSC  0xc0010280
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
 
@@ -65,6 +66,8 @@ struct fam15h_power_data {
u64 max_cu_acc_power;
/* accumulated power of the compute units */
u64 cu_acc_power[MAX_CUS];
+   /* performance timestamp counter */
+   u64 cpu_sw_pwr_ptsc[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data)
cu = cpu_data(cpu).cpu_core_id;
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+   rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
 }
 
 /*
-- 
1.9.1

[PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-04-06 Thread Huang Rui

This patch adds a member in fam15h_power_data which specifies the
compute unit accumulated power. It adds do_read_registers_on_cu to do
all the read to all MSRs and run it on one of the online cores on each
compute unit with smp_call_function_many(). This behavior can decrease
IPI numbers.

Suggested-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
---
 drivers/hwmon/fam15h_power.c | 72 +++-
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4f695d8..4edbaf0 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");
 
 #define FAM15H_MIN_NUM_ATTRS   2
 #define FAM15H_NUM_GROUPS  2
+#define MAX_CUS8
 
+#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -59,6 +63,8 @@ struct fam15h_power_data {
struct attribute_group group;
/* maximum accumulated power of a compute unit */
u64 max_cu_acc_power;
+   /* accumulated power of the compute units */
+   u64 cu_acc_power[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -125,6 +131,70 @@ static ssize_t show_power_crit(struct device *dev,
 }
 static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
 
+static void do_read_registers_on_cu(void *_data)
+{
+   struct fam15h_power_data *data = _data;
+   int cpu, cu;
+
+   cpu = smp_processor_id();
+
+   /*
+* With the new x86 topology modelling, cpu core id actually
+* is compute unit id.
+*/
+   cu = cpu_data(cpu).cpu_core_id;
+
+   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+}
+
+/*
+ * This function is only able to be called when CPUID
+ * Fn8000_0007:EDX[12] is set.
+ */
+static int read_registers(struct fam15h_power_data *data)
+{
+   int this_cpu, ret, cpu;
+   int core, this_core;
+   cpumask_var_t mask;
+
+   ret = zalloc_cpumask_var(, GFP_KERNEL);
+   if (!ret)
+   return -ENOMEM;
+
+   get_online_cpus();
+   this_cpu = smp_processor_id();
+
+   /*
+* Choose the first online core of each compute unit, and then
+* read their MSR value of power and ptsc in a single IPI,
+* because the MSR value of CPU core represent the compute
+* unit's.
+*/
+   core = -1;
+
+   for_each_online_cpu(cpu) {
+   this_core = topology_core_id(cpu);
+
+   if (this_core == core)
+   continue;
+
+   core = this_core;
+
+   /* get any CPU on this compute unit */
+   cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), 
mask);
+   }
+
+   if (cpumask_test_cpu(this_cpu, mask))
+   do_read_registers_on_cu(data);
+
+   smp_call_function_many(mask, do_read_registers_on_cu, data, true);
+   put_online_cpus();
+
+   free_cpumask_var(mask);
+
+   return 0;
+}
+
 static int fam15h_power_init_attrs(struct pci_dev *pdev,
   struct fam15h_power_data *data)
 {
@@ -263,7 +333,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
 
data->max_cu_acc_power = tmp;
 
-   return 0;
+   return read_registers(data);
 }
 
 static int fam15h_power_probe(struct pci_dev *pdev,
-- 
1.9.1

[PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power

2016-04-06 Thread Huang Rui

PTSC is the performance timestamp counter value in a cpu core and the
cores in one compute unit have the fixed frequency. So it picks up the
performance timestamp counter value of the first core per compute unit
to measure the interval for average power per compute unit.

Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
---
 drivers/hwmon/fam15h_power.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4edbaf0..336d422 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -50,6 +50,7 @@ MODULE_LICENSE("GPL");
 
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
+#define MSR_F15H_PTSC  0xc0010280
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
 
@@ -65,6 +66,8 @@ struct fam15h_power_data {
u64 max_cu_acc_power;
/* accumulated power of the compute units */
u64 cu_acc_power[MAX_CUS];
+   /* performance timestamp counter */
+   u64 cpu_sw_pwr_ptsc[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data)
cu = cpu_data(cpu).cpu_core_id;
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+   rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
 }
 
 /*
-- 
1.9.1

[PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-04-06 Thread Huang Rui

This patch adds a member in fam15h_power_data which specifies the
compute unit accumulated power. It adds do_read_registers_on_cu to do
all the read to all MSRs and run it on one of the online cores on each
compute unit with smp_call_function_many(). This behavior can decrease
IPI numbers.

Suggested-by: Borislav Petkov 
Signed-off-by: Huang Rui 
---
 drivers/hwmon/fam15h_power.c | 72 +++-
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4f695d8..4edbaf0 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");
 
 #define FAM15H_MIN_NUM_ATTRS   2
 #define FAM15H_NUM_GROUPS  2
+#define MAX_CUS8
 
+#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -59,6 +63,8 @@ struct fam15h_power_data {
struct attribute_group group;
/* maximum accumulated power of a compute unit */
u64 max_cu_acc_power;
+   /* accumulated power of the compute units */
+   u64 cu_acc_power[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -125,6 +131,70 @@ static ssize_t show_power_crit(struct device *dev,
 }
 static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
 
+static void do_read_registers_on_cu(void *_data)
+{
+   struct fam15h_power_data *data = _data;
+   int cpu, cu;
+
+   cpu = smp_processor_id();
+
+   /*
+* With the new x86 topology modelling, cpu core id actually
+* is compute unit id.
+*/
+   cu = cpu_data(cpu).cpu_core_id;
+
+   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+}
+
+/*
+ * This function is only able to be called when CPUID
+ * Fn8000_0007:EDX[12] is set.
+ */
+static int read_registers(struct fam15h_power_data *data)
+{
+   int this_cpu, ret, cpu;
+   int core, this_core;
+   cpumask_var_t mask;
+
+   ret = zalloc_cpumask_var(, GFP_KERNEL);
+   if (!ret)
+   return -ENOMEM;
+
+   get_online_cpus();
+   this_cpu = smp_processor_id();
+
+   /*
+* Choose the first online core of each compute unit, and then
+* read their MSR value of power and ptsc in a single IPI,
+* because the MSR value of CPU core represent the compute
+* unit's.
+*/
+   core = -1;
+
+   for_each_online_cpu(cpu) {
+   this_core = topology_core_id(cpu);
+
+   if (this_core == core)
+   continue;
+
+   core = this_core;
+
+   /* get any CPU on this compute unit */
+   cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), 
mask);
+   }
+
+   if (cpumask_test_cpu(this_cpu, mask))
+   do_read_registers_on_cu(data);
+
+   smp_call_function_many(mask, do_read_registers_on_cu, data, true);
+   put_online_cpus();
+
+   free_cpumask_var(mask);
+
+   return 0;
+}
+
 static int fam15h_power_init_attrs(struct pci_dev *pdev,
   struct fam15h_power_data *data)
 {
@@ -263,7 +333,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
 
data->max_cu_acc_power = tmp;
 
-   return 0;
+   return read_registers(data);
 }
 
 static int fam15h_power_probe(struct pci_dev *pdev,
-- 
1.9.1

[PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm

2016-04-06 Thread Huang Rui

This patch adds the description to explain the TDP reporting mechanism
and accumulated power algorithm.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
---
 Documentation/hwmon/fam15h_power | 65 +++-
 drivers/hwmon/fam15h_power.c |  2 +-
 2 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power
index e2b1b69..fb594c2 100644
--- a/Documentation/hwmon/fam15h_power
+++ b/Documentation/hwmon/fam15h_power
@@ -10,14 +10,22 @@ Supported chips:
   Datasheets:
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors
+  AMD64 Architecture Programmer's Manual Volume 2: System Programming
 
 Author: Andreas Herrmann <herrmann.der.u...@googlemail.com>
 
 Description
 ---
 
+1) Processor TDP (Thermal design power)
+
+Given a fixed frequency and voltage, the power consumption of a
+processor varies based on the workload being executed. Derated power
+is the power consumed when running a specific application. Thermal
+design power (TDP) is an example of derated power.
+
 This driver permits reading of registers providing power information
-of AMD Family 15h and 16h processors.
+of AMD Family 15h and 16h processors via TDP algorithm.
 
 For AMD Family 15h and 16h processors the following power values can
 be calculated using different processor northbridge function
@@ -37,3 +45,58 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts:
 On multi-node processors the calculated value is for the entire
 package and not for a single node. Thus the driver creates sysfs
 attributes only for internal node0 of a multi-node processor.
+
+2) Accumulated Power Mechanism
+
+This driver also introduces an algorithm that should be used to
+calculate the average power consumed by a processor during a
+measurement interval Tm. The feature of accumulated power mechanism is
+indicated by CPUID Fn8000_0007_EDX[12].
+
+* Tsample: compute unit power accumulator sample period
+* Tref: the PTSC counter period
+* PTSC: performance timestamp counter
+* N: the ratio of compute unit power accumulator sample period to the
+  PTSC period
+* Jmax: max compute unit accumulated power which is indicated by
+  MaxCpuSwPwrAcc MSR C001007b
+* Jx/Jy: compute unit accumulated power which is indicated by
+  CpuSwPwrAcc MSR C001007a
+* Tx/Ty: the value of performance timestamp counter which is indicated
+  by CU_PTSC MSR C0010280
+* PwrCPUave: CPU average power
+
+i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
+   N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
+
+ii. Read the full range of the cumulative energy value from the new
+MSR MaxCpuSwPwrAcc.
+   Jmax = value returned.
+iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jx = value read from CpuSwPwrAcc and Tx = value read from
+PTSC.
+
+iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jy = value read from CpuSwPwrAcc and Ty = value read from
+PTSC.
+
+v. Calculate the average power consumption for a compute unit over
+time period (y-x). Unit of result is uWatt.
+   if (Jy < Jx) // Rollover has occurred
+   Jdelta = (Jy + Jmax) - Jx
+   else
+   Jdelta = Jy - Jx
+   PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
+
+This driver provides PwrCPUave and interval(default is 10 millisecond
+and maximum is 1 second):
+* power1_average (PwrCPUave)
+* power1_average_interval (Interval)
+
+The power1_average_interval can be updated at /etc/sensors3.conf file
+as below:
+
+chip "fam15h_power-*"
+   set power1_average_interval 0.01
+
+Then save it with "sensors -s".
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 5abbfa8..7d9d697 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -1,7 +1,7 @@
 /*
  * fam15h_power.c - AMD Family 15h processor power monitoring
  *
- * Copyright (c) 2011 Advanced Micro Devices, Inc.
+ * Copyright (c) 2011-2016 Advanced Micro Devices, Inc.
  * Author: Andreas Herrmann <herrmann.der.u...@googlemail.com>
  *
  *
-- 
1.9.1

[PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm

2016-04-06 Thread Huang Rui

This patch introduces an algorithm that computes the average power by
reading a delta value of “core power accumulator” register during
measurement interval, and then dividing delta value by the length of
the time interval.

User is able to use power1_average entry to measure the processor power
consumption and power1_average_interval entry to set the interval.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

The result is current average processor power consumption in 10
millisecond. The unit of the result is uWatt.

Suggested-by: Guenter Roeck <li...@roeck-us.net>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
---
 drivers/hwmon/fam15h_power.c | 128 +--
 1 file changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 336d422..5abbfa8 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -48,6 +50,9 @@ MODULE_LICENSE("GPL");
 #define FAM15H_NUM_GROUPS  2
 #define MAX_CUS8
 
+/* set maximum interval as 1 second */
+#define MAX_INTERVAL   1000
+
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 #define MSR_F15H_PTSC  0xc0010280
@@ -68,6 +73,9 @@ struct fam15h_power_data {
u64 cu_acc_power[MAX_CUS];
/* performance timestamp counter */
u64 cpu_sw_pwr_ptsc[MAX_CUS];
+   /* online/offline status of current compute unit */
+   int cu_on[MAX_CUS];
+   unsigned long power_period;
 };
 
 static ssize_t show_power(struct device *dev,
@@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data)
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
+
+   data->cu_on[cu] = 1;
 }
 
 /*
@@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data)
if (!ret)
return -ENOMEM;
 
+   memset(data->cu_on, 0, sizeof(int) * MAX_CUS);
+
get_online_cpus();
this_cpu = smp_processor_id();
 
@@ -199,6 +211,98 @@ static int read_registers(struct fam15h_power_data *data)
return 0;
 }
 
+static ssize_t acc_show_power(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+   u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
+   jdelta[MAX_CUS];
+   u64 tdelta, avg_acc;
+   int cu, cu_num, ret;
+   signed long leftover;
+
+   /*
+* With the new x86 topology modelling, x86_max_cores is the
+* compute unit number.
+*/
+   cu_num = boot_cpu_data.x86_max_cores;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0; cu < cu_num; cu++) {
+   prev_cu_acc_power[cu] = data->cu_acc_power[cu];
+   prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu];
+   }
+
+   leftover = 
schedule_timeout_interruptible(msecs_to_jiffies(data->power_period));
+   if (leftover)
+   return 0;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0, avg_acc = 0; cu < cu_num; cu++) {
+   /* check if current compute unit is online */
+   if (data->cu_on[cu] == 0)
+   continue;
+
+   if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) {
+   jdelta[cu] = data->max_cu_acc_power + 
data->cu_acc_power[cu];
+   jdelta[cu] -= prev_cu_acc_power[cu];
+   } else {
+   jdelta[cu] = data->cu_acc_power[cu] - 
prev_cu_acc_power[cu];
+   }
+   tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu];
+   jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000;
+   do_div(jdelta[cu], tdelta);
+
+   /* the unit is microWatt */
+   avg_acc += jdelta[cu];
+   }
+
+   return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
+}
+static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
+
+static ssize_t acc_show_power_period(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+
+   return sprintf(buf, "%lu\n", data->power_period);
+}
+
+static ssize_t acc_

[PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function

2016-04-06 Thread Huang Rui

This patch adds a platform check function to make code more readable.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
---
 drivers/hwmon/fam15h_power.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 7d9d697..eb97a92 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -78,6 +78,11 @@ struct fam15h_power_data {
unsigned long power_period;
 };
 
+static bool is_carrizo_or_later(void)
+{
+   return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
+}
+
 static ssize_t show_power(struct device *dev,
  struct device_attribute *attr, char *buf)
 {
@@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, TdpRunAvgAccCap bit field
 * is extended to 4:31 from 4:25.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) {
+   if (is_carrizo_or_later()) {
running_avg_capture = val >> 4;
running_avg_capture = sign_extend32(running_avg_capture, 27);
} else {
@@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, ApmTdpLimit bit field
 * is extended to 16:31 from 16:28.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60)
+   if (is_carrizo_or_later())
tdp_limit = val >> 16;
else
tdp_limit = (val >> 16) & 0x1fff;
-- 
1.9.1

[PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm

2016-04-06 Thread Huang Rui

This patch adds the description to explain the TDP reporting mechanism
and accumulated power algorithm.

Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
---
 Documentation/hwmon/fam15h_power | 65 +++-
 drivers/hwmon/fam15h_power.c |  2 +-
 2 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power
index e2b1b69..fb594c2 100644
--- a/Documentation/hwmon/fam15h_power
+++ b/Documentation/hwmon/fam15h_power
@@ -10,14 +10,22 @@ Supported chips:
   Datasheets:
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors
+  AMD64 Architecture Programmer's Manual Volume 2: System Programming
 
 Author: Andreas Herrmann 
 
 Description
 ---
 
+1) Processor TDP (Thermal design power)
+
+Given a fixed frequency and voltage, the power consumption of a
+processor varies based on the workload being executed. Derated power
+is the power consumed when running a specific application. Thermal
+design power (TDP) is an example of derated power.
+
 This driver permits reading of registers providing power information
-of AMD Family 15h and 16h processors.
+of AMD Family 15h and 16h processors via TDP algorithm.
 
 For AMD Family 15h and 16h processors the following power values can
 be calculated using different processor northbridge function
@@ -37,3 +45,58 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts:
 On multi-node processors the calculated value is for the entire
 package and not for a single node. Thus the driver creates sysfs
 attributes only for internal node0 of a multi-node processor.
+
+2) Accumulated Power Mechanism
+
+This driver also introduces an algorithm that should be used to
+calculate the average power consumed by a processor during a
+measurement interval Tm. The feature of accumulated power mechanism is
+indicated by CPUID Fn8000_0007_EDX[12].
+
+* Tsample: compute unit power accumulator sample period
+* Tref: the PTSC counter period
+* PTSC: performance timestamp counter
+* N: the ratio of compute unit power accumulator sample period to the
+  PTSC period
+* Jmax: max compute unit accumulated power which is indicated by
+  MaxCpuSwPwrAcc MSR C001007b
+* Jx/Jy: compute unit accumulated power which is indicated by
+  CpuSwPwrAcc MSR C001007a
+* Tx/Ty: the value of performance timestamp counter which is indicated
+  by CU_PTSC MSR C0010280
+* PwrCPUave: CPU average power
+
+i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
+   N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
+
+ii. Read the full range of the cumulative energy value from the new
+MSR MaxCpuSwPwrAcc.
+   Jmax = value returned.
+iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jx = value read from CpuSwPwrAcc and Tx = value read from
+PTSC.
+
+iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jy = value read from CpuSwPwrAcc and Ty = value read from
+PTSC.
+
+v. Calculate the average power consumption for a compute unit over
+time period (y-x). Unit of result is uWatt.
+   if (Jy < Jx) // Rollover has occurred
+   Jdelta = (Jy + Jmax) - Jx
+   else
+   Jdelta = Jy - Jx
+   PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
+
+This driver provides PwrCPUave and interval(default is 10 millisecond
+and maximum is 1 second):
+* power1_average (PwrCPUave)
+* power1_average_interval (Interval)
+
+The power1_average_interval can be updated at /etc/sensors3.conf file
+as below:
+
+chip "fam15h_power-*"
+   set power1_average_interval 0.01
+
+Then save it with "sensors -s".
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 5abbfa8..7d9d697 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -1,7 +1,7 @@
 /*
  * fam15h_power.c - AMD Family 15h processor power monitoring
  *
- * Copyright (c) 2011 Advanced Micro Devices, Inc.
+ * Copyright (c) 2011-2016 Advanced Micro Devices, Inc.
  * Author: Andreas Herrmann 
  *
  *
-- 
1.9.1

[PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm

2016-04-06 Thread Huang Rui

This patch introduces an algorithm that computes the average power by
reading a delta value of “core power accumulator” register during
measurement interval, and then dividing delta value by the length of
the time interval.

User is able to use power1_average entry to measure the processor power
consumption and power1_average_interval entry to set the interval.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

The result is current average processor power consumption in 10
millisecond. The unit of the result is uWatt.

Suggested-by: Guenter Roeck 
Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
---
 drivers/hwmon/fam15h_power.c | 128 +--
 1 file changed, 124 insertions(+), 4 deletions(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 336d422..5abbfa8 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -48,6 +50,9 @@ MODULE_LICENSE("GPL");
 #define FAM15H_NUM_GROUPS  2
 #define MAX_CUS8
 
+/* set maximum interval as 1 second */
+#define MAX_INTERVAL   1000
+
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 #define MSR_F15H_PTSC  0xc0010280
@@ -68,6 +73,9 @@ struct fam15h_power_data {
u64 cu_acc_power[MAX_CUS];
/* performance timestamp counter */
u64 cpu_sw_pwr_ptsc[MAX_CUS];
+   /* online/offline status of current compute unit */
+   int cu_on[MAX_CUS];
+   unsigned long power_period;
 };
 
 static ssize_t show_power(struct device *dev,
@@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data)
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
+
+   data->cu_on[cu] = 1;
 }
 
 /*
@@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data)
if (!ret)
return -ENOMEM;
 
+   memset(data->cu_on, 0, sizeof(int) * MAX_CUS);
+
get_online_cpus();
this_cpu = smp_processor_id();
 
@@ -199,6 +211,98 @@ static int read_registers(struct fam15h_power_data *data)
return 0;
 }
 
+static ssize_t acc_show_power(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+   u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
+   jdelta[MAX_CUS];
+   u64 tdelta, avg_acc;
+   int cu, cu_num, ret;
+   signed long leftover;
+
+   /*
+* With the new x86 topology modelling, x86_max_cores is the
+* compute unit number.
+*/
+   cu_num = boot_cpu_data.x86_max_cores;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0; cu < cu_num; cu++) {
+   prev_cu_acc_power[cu] = data->cu_acc_power[cu];
+   prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu];
+   }
+
+   leftover = 
schedule_timeout_interruptible(msecs_to_jiffies(data->power_period));
+   if (leftover)
+   return 0;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0, avg_acc = 0; cu < cu_num; cu++) {
+   /* check if current compute unit is online */
+   if (data->cu_on[cu] == 0)
+   continue;
+
+   if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) {
+   jdelta[cu] = data->max_cu_acc_power + 
data->cu_acc_power[cu];
+   jdelta[cu] -= prev_cu_acc_power[cu];
+   } else {
+   jdelta[cu] = data->cu_acc_power[cu] - 
prev_cu_acc_power[cu];
+   }
+   tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu];
+   jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000;
+   do_div(jdelta[cu], tdelta);
+
+   /* the unit is microWatt */
+   avg_acc += jdelta[cu];
+   }
+
+   return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
+}
+static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
+
+static ssize_t acc_show_power_period(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+
+   return sprintf(buf, "%lu\n", data->power_period);
+}
+
+static ssize_t acc_set_power_period(struct device *dev,
+   struct device_attribute *attr,

[PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function

2016-04-06 Thread Huang Rui

This patch adds a platform check function to make code more readable.

Signed-off-by: Huang Rui 
---
 drivers/hwmon/fam15h_power.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 7d9d697..eb97a92 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -78,6 +78,11 @@ struct fam15h_power_data {
unsigned long power_period;
 };
 
+static bool is_carrizo_or_later(void)
+{
+   return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
+}
+
 static ssize_t show_power(struct device *dev,
  struct device_attribute *attr, char *buf)
 {
@@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, TdpRunAvgAccCap bit field
 * is extended to 4:31 from 4:25.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) {
+   if (is_carrizo_or_later()) {
running_avg_capture = val >> 4;
running_avg_capture = sign_extend32(running_avg_capture, 27);
} else {
@@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, ApmTdpLimit bit field
 * is extended to 16:31 from 16:28.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60)
+   if (is_carrizo_or_later())
tdp_limit = val >> 16;
else
tdp_limit = (val >> 16) & 0x1fff;
-- 
1.9.1

[PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm

2016-04-06 Thread Huang Rui


Hi Guenter,

This serial of patches introduces an accumulated power reporting
algorithm. It will calculate the average power consumption for the
processor. The cpu feature flag is CPUID.8000_0007H:EDX[12].

This algorithm is used to test the comparison of processor power
consumption with between MWAITX delay and TSC delay on AMD Carrizo
platforms.

Reference:
https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com

Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support")
Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a 
configurable timer")

V1: 
https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com
V2: 
https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com
V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com
V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com
V5: https://lkml.kernel.org/r/1459143136-2412-1-git-send-email-ray.hu...@amd.com

On V5, I used the new x86 topology with compute unit id (cpu_core_id)
and compute unit numbers (x86_max_cores) for AMD processors. The
change and documentation will go to master soon as below link. And
This way is better to use smp_num_siblings which is un-defined
variable out of CONFIG_SMP.

On V6, Guenter, I suggest that you merge the v4.6-rc2 into your
hwmon-next branch. Because there are two commits (ee6825, 8196da)
which are the dependences of this patch set to fix the AMD compute
unit and max cores things. And you can see the new topology info at
Documentation/x86/topology.txt.

Changes from v1 -> v2:
- Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to
  avoid overwrite on multi-CPU system.
- Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code.
- Remove unnecessary warning print.
- Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one
  of the online cores on each compute unit with smp_call_function_many().
- Use power1_average and power1_average_interval standard entry
  instread of power1_acc
- Fix the CPU-hotplug case.

Changes from v2 -> v3:
- As Guenter's suggestion, remove typecast, use >groups[0].
- Remove all "fam15_power_*" prefix at data.
- Remove unnecessary ( ).
- Fix the issue that is reported by build test robot, and add
  CPU_SUP_AMD as the dependence of fam15h_power
- Remove the WARN_ON at do_read_registers_on_cu, because it must be
  behind CPUID check. The MSR must be available since
  CPUID.8000_0007H:EDX[12] is set 
- Add get_online_cpus()/put_online_cpus() functions.
- Refine commments and the method which generate cpumask for cu.
- Add the interval scope to make the value suitable for user
  experience
- Remove the useless mutex.

Changes from v3 -> v4:
- Rebase the patches to latest groeck/hwmon-next.
- Use smp_num_siblings instead of cores_per_cu accessor.
- Refine the cpumask method which is inspired by perf solution.
- Fix some typo and errors.

Changes from v4 -> v5:
- Rebase patches to v4.6-rc1.
- Use new x86 topology with compute id (cpu_core_id) and compute unit
  number (x86_max_cores) instead of smp_num_siblings.

Changes from v5 -> v6:
- Remove unuse get_cpu/put_cpu.
- Update cpumask handling method with new topology.
- Use X86_FEATURE_ACC_POWER flag instead cpuid
- Add comments to explain default interval value and add documentation
  to explain how to modify the interval.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

These patches are rebased on v4.6-rc2.

Thanks,
Rui


Huang Rui (6):
  hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
  hwmon: (fam15h_power) Add compute unit accumulated power
  hwmon: (fam15h_power) Add ptsc counter value for accumulated power
  hwmon: (fam15h_power) Introduce a cpu accumulated power reporting
algorithm
  hwmon: (fam15h_power) Add documentation for TDP and accumulated power
algorithm
  hwmon: (fam15h_power) Add platform check function

 Documentation/hwmon/fam15h_power |  65 +++-
 drivers/hwmon/Kconfig|   2 +-
 drivers/hwmon/fam15h_power.c | 215 +--
 3 files changed, 272 insertions(+), 10 deletions(-)

-- 
1.9.1

[PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm

2016-04-06 Thread Huang Rui


Hi Guenter,

This serial of patches introduces an accumulated power reporting
algorithm. It will calculate the average power consumption for the
processor. The cpu feature flag is CPUID.8000_0007H:EDX[12].

This algorithm is used to test the comparison of processor power
consumption with between MWAITX delay and TSC delay on AMD Carrizo
platforms.

Reference:
https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com

Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support")
Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a 
configurable timer")

V1: 
https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com
V2: 
https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com
V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com
V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com
V5: https://lkml.kernel.org/r/1459143136-2412-1-git-send-email-ray.hu...@amd.com

On V5, I used the new x86 topology with compute unit id (cpu_core_id)
and compute unit numbers (x86_max_cores) for AMD processors. The
change and documentation will go to master soon as below link. And
This way is better to use smp_num_siblings which is un-defined
variable out of CONFIG_SMP.

On V6, Guenter, I suggest that you merge the v4.6-rc2 into your
hwmon-next branch. Because there are two commits (ee6825, 8196da)
which are the dependences of this patch set to fix the AMD compute
unit and max cores things. And you can see the new topology info at
Documentation/x86/topology.txt.

Changes from v1 -> v2:
- Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to
  avoid overwrite on multi-CPU system.
- Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code.
- Remove unnecessary warning print.
- Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one
  of the online cores on each compute unit with smp_call_function_many().
- Use power1_average and power1_average_interval standard entry
  instread of power1_acc
- Fix the CPU-hotplug case.

Changes from v2 -> v3:
- As Guenter's suggestion, remove typecast, use >groups[0].
- Remove all "fam15_power_*" prefix at data.
- Remove unnecessary ( ).
- Fix the issue that is reported by build test robot, and add
  CPU_SUP_AMD as the dependence of fam15h_power
- Remove the WARN_ON at do_read_registers_on_cu, because it must be
  behind CPUID check. The MSR must be available since
  CPUID.8000_0007H:EDX[12] is set 
- Add get_online_cpus()/put_online_cpus() functions.
- Refine commments and the method which generate cpumask for cu.
- Add the interval scope to make the value suitable for user
  experience
- Remove the useless mutex.

Changes from v3 -> v4:
- Rebase the patches to latest groeck/hwmon-next.
- Use smp_num_siblings instead of cores_per_cu accessor.
- Refine the cpumask method which is inspired by perf solution.
- Fix some typo and errors.

Changes from v4 -> v5:
- Rebase patches to v4.6-rc1.
- Use new x86 topology with compute id (cpu_core_id) and compute unit
  number (x86_max_cores) instead of smp_num_siblings.

Changes from v5 -> v6:
- Remove unuse get_cpu/put_cpu.
- Update cpumask handling method with new topology.
- Use X86_FEATURE_ACC_POWER flag instead cpuid
- Add comments to explain default interval value and add documentation
  to explain how to modify the interval.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

These patches are rebased on v4.6-rc2.

Thanks,
Rui


Huang Rui (6):
  hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
  hwmon: (fam15h_power) Add compute unit accumulated power
  hwmon: (fam15h_power) Add ptsc counter value for accumulated power
  hwmon: (fam15h_power) Introduce a cpu accumulated power reporting
algorithm
  hwmon: (fam15h_power) Add documentation for TDP and accumulated power
algorithm
  hwmon: (fam15h_power) Add platform check function

 Documentation/hwmon/fam15h_power |  65 +++-
 drivers/hwmon/Kconfig|   2 +-
 drivers/hwmon/fam15h_power.c | 215 +--
 3 files changed, 272 insertions(+), 10 deletions(-)

-- 
1.9.1

[PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence

2016-04-06 Thread Huang Rui

This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power
driver. Because the following patch will use the interface from
x86/kernel/cpu/amd.c.

Otherwise, the below error might be encountered:

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `fam15h_power_probe':
>> fam15h_power.c:(.text+0x26e3a3): undefined reference to
>> `amd_get_cores_per_cu'
   fam15h_power.c:(.text+0x26e41e): undefined reference to
`amd_get_cores_per_cu'

Reported-by: build test robot <l...@intel.com>
Acked-by: Borislav Petkov <b...@suse.de>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
---
 drivers/hwmon/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 5c2d13a..4be3792 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -288,7 +288,7 @@ config SENSORS_K10TEMP
 
 config SENSORS_FAM15H_POWER
tristate "AMD Family 15h processor power"
-   depends on X86 && PCI
+   depends on X86 && PCI && CPU_SUP_AMD
help
  If you say yes here you get support for processor power
  information of your AMD family 15h CPU.
-- 
1.9.1

[PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence

2016-04-06 Thread Huang Rui

This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power
driver. Because the following patch will use the interface from
x86/kernel/cpu/amd.c.

Otherwise, the below error might be encountered:

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `fam15h_power_probe':
>> fam15h_power.c:(.text+0x26e3a3): undefined reference to
>> `amd_get_cores_per_cu'
   fam15h_power.c:(.text+0x26e41e): undefined reference to
`amd_get_cores_per_cu'

Reported-by: build test robot 
Acked-by: Borislav Petkov 
Signed-off-by: Huang Rui 
---
 drivers/hwmon/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 5c2d13a..4be3792 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -288,7 +288,7 @@ config SENSORS_K10TEMP
 
 config SENSORS_FAM15H_POWER
tristate "AMD Family 15h processor power"
-   depends on X86 && PCI
+   depends on X86 && PCI && CPU_SUP_AMD
help
  If you say yes here you get support for processor power
  information of your AMD family 15h CPU.
-- 
1.9.1

[tip:perf/core] perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support

2016-03-31 Thread tip-bot for Huang Rui

Commit-ID:  8a22426184774d7ced9c1d3aa4d95d34101fb3be
Gitweb: http://git.kernel.org/tip/8a22426184774d7ced9c1d3aa4d95d34101fb3be
Author: Huang Rui <ray.hu...@amd.com>
AuthorDate: Fri, 29 Jan 2016 16:29:56 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Thu, 31 Mar 2016 10:30:39 +0200

perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support

AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter
which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100
MHz rate in all P-states, and C states, S0, or S1. The frequency is
about 100MHz. This counter will be used to calculate processor power
and other parts. So add an interface into the MSR PMU to get the PTSC
counter value.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Aravind Gopalakrishnan <aravind.gopalakrish...@amd.com>
Cc: Arnaldo Carvalho de Melo <a...@kernel.org>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: Fengguang Wu <fengguang...@intel.com>
Cc: Jacob Shin <jacob.w.s...@gmail.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Robert Richter <r...@kernel.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Link: 
http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.hu...@amd.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/msr.c  | 8 
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/msr-index.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index ec863b9..6f6772f 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -6,6 +6,7 @@ enum perf_msr_id {
PERF_MSR_MPERF  = 2,
PERF_MSR_PPERF  = 3,
PERF_MSR_SMI= 4,
+   PERF_MSR_PTSC   = 5,
 
PERF_MSR_EVENT_MAX,
 };
@@ -15,6 +16,11 @@ static bool test_aperfmperf(int idx)
return boot_cpu_has(X86_FEATURE_APERFMPERF);
 }
 
+static bool test_ptsc(int idx)
+{
+   return boot_cpu_has(X86_FEATURE_PTSC);
+}
+
 static bool test_intel(int idx)
 {
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
@@ -74,6 +80,7 @@ PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01");
 PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02");
 PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03");
 PMU_EVENT_ATTR_STRING(smi,   evattr_smi,   "event=0x04");
+PMU_EVENT_ATTR_STRING(ptsc,  evattr_ptsc,  "event=0x05");
 
 static struct perf_msr msr[] = {
[PERF_MSR_TSC]   = { 0, _tsc,NULL,   
 },
@@ -81,6 +88,7 @@ static struct perf_msr msr[] = {
[PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf,  
test_aperfmperf, },
[PERF_MSR_PPERF] = { MSR_PPERF, _pperf,  test_intel, 
 },
[PERF_MSR_SMI]   = { MSR_SMI_COUNT, _smi,test_intel, 
 },
+   [PERF_MSR_PTSC]   = { MSR_F15H_PTSC,_ptsc,   test_ptsc,  
 },
 };
 
 static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = {
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 44ebd04..bdf9042 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -177,6 +177,7 @@
 #define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter 
extensions */
 #define X86_FEATURE_PERFCTR_NB  ( 6*32+24) /* NB performance counter 
extensions */
 #define X86_FEATURE_BPEXT  (6*32+26) /* data breakpoint extension */
+#define X86_FEATURE_PTSC   ( 6*32+27) /* performance time-stamp counter */
 #define X86_FEATURE_PERFCTR_L2 ( 6*32+28) /* L2 performance counter extensions 
*/
 #define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) 
*/
 
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 984ab75..6e6a5cc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -326,6 +326,7 @@
 #define MSR_F15H_PERF_CTR  0xc0010201
 #define MSR_F15H_NB_PERF_CTL   0xc0010240
 #define MSR_F15H_NB_PERF_CTR   0xc0010241
+#define MSR_F15H_PTSC  0xc0010280
 #define MSR_F15H_IC_CFG0xc0011021
 
 /* Fam 10h MSRs */

[tip:perf/core] perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter

2016-03-31 Thread tip-bot for Huang Rui

Commit-ID:  aaf248848db503927644d28e239bc399ed45959f
Gitweb: http://git.kernel.org/tip/aaf248848db503927644d28e239bc399ed45959f
Author: Huang Rui <ray.hu...@amd.com>
AuthorDate: Fri, 29 Jan 2016 16:29:57 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Thu, 31 Mar 2016 10:30:39 +0200

perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter

AMD Zeppelin (Family 17h, Model 00h) introduces an instructions
retired performance counter which is indicated by
CPUID.8000_0008H:EBX[1]. A dedicated Instructions Retired MSR register
(MSR 0xC000_000E9) increments once for every instruction retired.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Aravind Gopalakrishnan <aravind.gopalakrish...@amd.com>
Cc: Arnaldo Carvalho de Melo <a...@kernel.org>
Cc: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: Fengguang Wu <fengguang...@intel.com>
Cc: Jacob Shin <jacob.w.s...@gmail.com>
Cc: Jiri Olsa <jo...@redhat.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Robert Richter <r...@kernel.org>
Cc: Stephane Eranian <eran...@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vince Weaver <vincent.wea...@maine.edu>
Link: 
http://lkml.kernel.org/r/1454056197-5893-3-git-send-email-ray.hu...@amd.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/msr.c  | 30 +++---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/msr-index.h   |  3 +++
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index 6f6772f..7111400 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -7,6 +7,7 @@ enum perf_msr_id {
PERF_MSR_PPERF  = 3,
PERF_MSR_SMI= 4,
PERF_MSR_PTSC   = 5,
+   PERF_MSR_IRPERF = 6,
 
PERF_MSR_EVENT_MAX,
 };
@@ -21,6 +22,11 @@ static bool test_ptsc(int idx)
return boot_cpu_has(X86_FEATURE_PTSC);
 }
 
+static bool test_irperf(int idx)
+{
+   return boot_cpu_has(X86_FEATURE_IRPERF);
+}
+
 static bool test_intel(int idx)
 {
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
@@ -75,20 +81,22 @@ struct perf_msr {
bool(*test)(int idx);
 };
 
-PMU_EVENT_ATTR_STRING(tsc,   evattr_tsc,   "event=0x00");
-PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01");
-PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02");
-PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03");
-PMU_EVENT_ATTR_STRING(smi,   evattr_smi,   "event=0x04");
-PMU_EVENT_ATTR_STRING(ptsc,  evattr_ptsc,  "event=0x05");
+PMU_EVENT_ATTR_STRING(tsc,evattr_tsc,"event=0x00");
+PMU_EVENT_ATTR_STRING(aperf,  evattr_aperf,  "event=0x01");
+PMU_EVENT_ATTR_STRING(mperf,  evattr_mperf,  "event=0x02");
+PMU_EVENT_ATTR_STRING(pperf,  evattr_pperf,  "event=0x03");
+PMU_EVENT_ATTR_STRING(smi,evattr_smi,"event=0x04");
+PMU_EVENT_ATTR_STRING(ptsc,   evattr_ptsc,   "event=0x05");
+PMU_EVENT_ATTR_STRING(irperf, evattr_irperf, "event=0x06");
 
 static struct perf_msr msr[] = {
-   [PERF_MSR_TSC]   = { 0, _tsc,NULL,   
 },
-   [PERF_MSR_APERF] = { MSR_IA32_APERF,_aperf,  
test_aperfmperf, },
-   [PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf,  
test_aperfmperf, },
-   [PERF_MSR_PPERF] = { MSR_PPERF, _pperf,  test_intel, 
 },
-   [PERF_MSR_SMI]   = { MSR_SMI_COUNT, _smi,test_intel, 
 },
+   [PERF_MSR_TSC]= { 0,_tsc,NULL,   
 },
+   [PERF_MSR_APERF]  = { MSR_IA32_APERF,   _aperf,  
test_aperfmperf, },
+   [PERF_MSR_MPERF]  = { MSR_IA32_MPERF,   _mperf,  
test_aperfmperf, },
+   [PERF_MSR_PPERF]  = { MSR_PPERF,_pperf,  test_intel, 
 },
+   [PERF_MSR_SMI]= { MSR_SMI_COUNT,_smi,test_intel, 
 },
[PERF_MSR_PTSC]   = { MSR_F15H_PTSC,_ptsc,   test_ptsc,  
 },
+   [PERF_MSR_IRPERF] = { MSR_F17H_IRPERF,  _irperf, test_irperf,
 },
 };
 
 static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = {
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index bdf9042..dd448a9 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -251,6 +251,7 @@
 
 /* AMD-defined CPU features, CPUID

[tip:perf/core] perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support

2016-03-31 Thread tip-bot for Huang Rui

Commit-ID:  8a22426184774d7ced9c1d3aa4d95d34101fb3be
Gitweb: http://git.kernel.org/tip/8a22426184774d7ced9c1d3aa4d95d34101fb3be
Author: Huang Rui 
AuthorDate: Fri, 29 Jan 2016 16:29:56 +0800
Committer:  Ingo Molnar 
CommitDate: Thu, 31 Mar 2016 10:30:39 +0200

perf/x86/msr: Add AMD PTSC (Performance Time-Stamp Counter) support

AMD Carrizo (Family 15h, Model 60h) introduces a time-stamp counter
which is indicated by CPUID.8000_0001H:ECX[27]. It increments at a 100
MHz rate in all P-states, and C states, S0, or S1. The frequency is
about 100MHz. This counter will be used to calculate processor power
and other parts. So add an interface into the MSR PMU to get the PTSC
counter value.

Signed-off-by: Huang Rui 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Alexander Shishkin 
Cc: Andy Lutomirski 
Cc: Aravind Gopalakrishnan 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: Borislav Petkov 
Cc: Borislav Petkov 
Cc: Fengguang Wu 
Cc: Jacob Shin 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Robert Richter 
Cc: Stephane Eranian 
Cc: Suravee Suthikulpanit 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Link: 
http://lkml.kernel.org/r/1454056197-5893-2-git-send-email-ray.hu...@amd.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/events/msr.c  | 8 
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/msr-index.h   | 1 +
 3 files changed, 10 insertions(+)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index ec863b9..6f6772f 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -6,6 +6,7 @@ enum perf_msr_id {
PERF_MSR_MPERF  = 2,
PERF_MSR_PPERF  = 3,
PERF_MSR_SMI= 4,
+   PERF_MSR_PTSC   = 5,
 
PERF_MSR_EVENT_MAX,
 };
@@ -15,6 +16,11 @@ static bool test_aperfmperf(int idx)
return boot_cpu_has(X86_FEATURE_APERFMPERF);
 }
 
+static bool test_ptsc(int idx)
+{
+   return boot_cpu_has(X86_FEATURE_PTSC);
+}
+
 static bool test_intel(int idx)
 {
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
@@ -74,6 +80,7 @@ PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01");
 PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02");
 PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03");
 PMU_EVENT_ATTR_STRING(smi,   evattr_smi,   "event=0x04");
+PMU_EVENT_ATTR_STRING(ptsc,  evattr_ptsc,  "event=0x05");
 
 static struct perf_msr msr[] = {
[PERF_MSR_TSC]   = { 0, _tsc,NULL,   
 },
@@ -81,6 +88,7 @@ static struct perf_msr msr[] = {
[PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf,  
test_aperfmperf, },
[PERF_MSR_PPERF] = { MSR_PPERF, _pperf,  test_intel, 
 },
[PERF_MSR_SMI]   = { MSR_SMI_COUNT, _smi,test_intel, 
 },
+   [PERF_MSR_PTSC]   = { MSR_F15H_PTSC,_ptsc,   test_ptsc,  
 },
 };
 
 static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = {
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 44ebd04..bdf9042 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -177,6 +177,7 @@
 #define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter 
extensions */
 #define X86_FEATURE_PERFCTR_NB  ( 6*32+24) /* NB performance counter 
extensions */
 #define X86_FEATURE_BPEXT  (6*32+26) /* data breakpoint extension */
+#define X86_FEATURE_PTSC   ( 6*32+27) /* performance time-stamp counter */
 #define X86_FEATURE_PERFCTR_L2 ( 6*32+28) /* L2 performance counter extensions 
*/
 #define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) 
*/
 
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 984ab75..6e6a5cc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -326,6 +326,7 @@
 #define MSR_F15H_PERF_CTR  0xc0010201
 #define MSR_F15H_NB_PERF_CTL   0xc0010240
 #define MSR_F15H_NB_PERF_CTR   0xc0010241
+#define MSR_F15H_PTSC  0xc0010280
 #define MSR_F15H_IC_CFG0xc0011021
 
 /* Fam 10h MSRs */

[tip:perf/core] perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter

2016-03-31 Thread tip-bot for Huang Rui

Commit-ID:  aaf248848db503927644d28e239bc399ed45959f
Gitweb: http://git.kernel.org/tip/aaf248848db503927644d28e239bc399ed45959f
Author: Huang Rui 
AuthorDate: Fri, 29 Jan 2016 16:29:57 +0800
Committer:  Ingo Molnar 
CommitDate: Thu, 31 Mar 2016 10:30:39 +0200

perf/x86/msr: Add AMD IRPERF (Instructions Retired) performance counter

AMD Zeppelin (Family 17h, Model 00h) introduces an instructions
retired performance counter which is indicated by
CPUID.8000_0008H:EBX[1]. A dedicated Instructions Retired MSR register
(MSR 0xC000_000E9) increments once for every instruction retired.

Signed-off-by: Huang Rui 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Alexander Shishkin 
Cc: Andy Lutomirski 
Cc: Aravind Gopalakrishnan 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnaldo Carvalho de Melo 
Cc: Borislav Petkov 
Cc: Borislav Petkov 
Cc: Fengguang Wu 
Cc: Jacob Shin 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Robert Richter 
Cc: Stephane Eranian 
Cc: Suravee Suthikulpanit 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Link: 
http://lkml.kernel.org/r/1454056197-5893-3-git-send-email-ray.hu...@amd.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/events/msr.c  | 30 +++---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/msr-index.h   |  3 +++
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index 6f6772f..7111400 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -7,6 +7,7 @@ enum perf_msr_id {
PERF_MSR_PPERF  = 3,
PERF_MSR_SMI= 4,
PERF_MSR_PTSC   = 5,
+   PERF_MSR_IRPERF = 6,
 
PERF_MSR_EVENT_MAX,
 };
@@ -21,6 +22,11 @@ static bool test_ptsc(int idx)
return boot_cpu_has(X86_FEATURE_PTSC);
 }
 
+static bool test_irperf(int idx)
+{
+   return boot_cpu_has(X86_FEATURE_IRPERF);
+}
+
 static bool test_intel(int idx)
 {
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
@@ -75,20 +81,22 @@ struct perf_msr {
bool(*test)(int idx);
 };
 
-PMU_EVENT_ATTR_STRING(tsc,   evattr_tsc,   "event=0x00");
-PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01");
-PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02");
-PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03");
-PMU_EVENT_ATTR_STRING(smi,   evattr_smi,   "event=0x04");
-PMU_EVENT_ATTR_STRING(ptsc,  evattr_ptsc,  "event=0x05");
+PMU_EVENT_ATTR_STRING(tsc,evattr_tsc,"event=0x00");
+PMU_EVENT_ATTR_STRING(aperf,  evattr_aperf,  "event=0x01");
+PMU_EVENT_ATTR_STRING(mperf,  evattr_mperf,  "event=0x02");
+PMU_EVENT_ATTR_STRING(pperf,  evattr_pperf,  "event=0x03");
+PMU_EVENT_ATTR_STRING(smi,evattr_smi,"event=0x04");
+PMU_EVENT_ATTR_STRING(ptsc,   evattr_ptsc,   "event=0x05");
+PMU_EVENT_ATTR_STRING(irperf, evattr_irperf, "event=0x06");
 
 static struct perf_msr msr[] = {
-   [PERF_MSR_TSC]   = { 0, _tsc,NULL,   
 },
-   [PERF_MSR_APERF] = { MSR_IA32_APERF,_aperf,  
test_aperfmperf, },
-   [PERF_MSR_MPERF] = { MSR_IA32_MPERF,_mperf,  
test_aperfmperf, },
-   [PERF_MSR_PPERF] = { MSR_PPERF, _pperf,  test_intel, 
 },
-   [PERF_MSR_SMI]   = { MSR_SMI_COUNT, _smi,test_intel, 
 },
+   [PERF_MSR_TSC]= { 0,_tsc,NULL,   
 },
+   [PERF_MSR_APERF]  = { MSR_IA32_APERF,   _aperf,  
test_aperfmperf, },
+   [PERF_MSR_MPERF]  = { MSR_IA32_MPERF,   _mperf,  
test_aperfmperf, },
+   [PERF_MSR_PPERF]  = { MSR_PPERF,_pperf,  test_intel, 
 },
+   [PERF_MSR_SMI]= { MSR_SMI_COUNT,_smi,test_intel, 
 },
[PERF_MSR_PTSC]   = { MSR_F15H_PTSC,_ptsc,   test_ptsc,  
 },
+   [PERF_MSR_IRPERF] = { MSR_F17H_IRPERF,  _irperf, test_irperf,
 },
 };
 
 static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = {
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index bdf9042..dd448a9 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -251,6 +251,7 @@
 
 /* AMD-defined CPU features, CPUID level 0x8008 (ebx), word 13 */
 #define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */
+#define X86_FEATURE_IRPERF (13*32+1) /* Instructions Retired Count */
 
 /* Thermal and Power Management Leaf, CPUID level 0x0006 (eax), word 14 */
 #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 6e6a5cc..e0e2f7d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -313,6 +313,9 @@
 #define MSR_AMD64_IBSOPDATA4   0xc001103d
 #define MSR_AMD64_IBS_REG_COUNT_MAX8 /* includes MSR_AM

[tip:x86/urgent] x86/cpu: Add advanced power management bits

2016-03-29 Thread tip-bot for Huang Rui

Commit-ID:  34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da
Gitweb: http://git.kernel.org/tip/34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da
Author: Huang Rui <ray.hu...@amd.com>
AuthorDate: Fri, 25 Mar 2016 10:08:40 +0800
Committer:  Thomas Gleixner <t...@linutronix.de>
CommitDate: Tue, 29 Mar 2016 11:12:11 +0200

x86/cpu: Add advanced power management bits

Bit 11 of CPUID 8000_0007 edx is processor feedback interface.
Bit 12 of CPUID 8000_0007 edx is accumulated power.

Print proper names in proc/cpuinfo

Reported-and-tested-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Tony Li <tony...@amd.com>
Cc: Fenghua Yu <fenghua...@intel.com>
Cc: Tony Luck <tony.l...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Fengguang Wu <fengguang...@intel.com>
Cc: Sherry Hurwitz <sherry.hurw...@amd.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: "Len Brown" <l...@kernel.org>
Link: 
http://lkml.kernel.org/r/1458871720-3209-1-git-send-email-ray.hu...@amd.com
Signed-off-by: Thomas Gleixner <t...@linutronix.de>

---
 arch/x86/kernel/cpu/powerflags.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/powerflags.c b/arch/x86/kernel/cpu/powerflags.c
index 31f0f33..1dd8294 100644
--- a/arch/x86/kernel/cpu/powerflags.c
+++ b/arch/x86/kernel/cpu/powerflags.c
@@ -18,4 +18,6 @@ const char *const x86_power_flags[32] = {
"", /* tsc invariant mapped to constant_tsc */
"cpb",  /* core performance boost */
"eff_freq_ro", /* Readonly aperf/mperf */
+   "proc_feedback", /* processor feedback interface */
+   "acc_power", /* accumulated power mechanism */
 };

[tip:x86/urgent] x86/cpu: Add advanced power management bits

2016-03-29 Thread tip-bot for Huang Rui

Commit-ID:  34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da
Gitweb: http://git.kernel.org/tip/34a4cceb78e48c75d1b48b25352a3f3b2cc2b2da
Author: Huang Rui 
AuthorDate: Fri, 25 Mar 2016 10:08:40 +0800
Committer:  Thomas Gleixner 
CommitDate: Tue, 29 Mar 2016 11:12:11 +0200

x86/cpu: Add advanced power management bits

Bit 11 of CPUID 8000_0007 edx is processor feedback interface.
Bit 12 of CPUID 8000_0007 edx is accumulated power.

Print proper names in proc/cpuinfo

Reported-and-tested-by: Borislav Petkov 
Signed-off-by: Huang Rui 
Cc: Tony Li 
Cc: Fenghua Yu 
Cc: Tony Luck 
Cc: Peter Zijlstra 
Cc: "Rafael J. Wysocki" 
Cc: Andy Lutomirski 
Cc: Fengguang Wu 
Cc: Sherry Hurwitz 
Cc: Borislav Petkov 
Cc: "Len Brown" 
Link: 
http://lkml.kernel.org/r/1458871720-3209-1-git-send-email-ray.hu...@amd.com
Signed-off-by: Thomas Gleixner 

---
 arch/x86/kernel/cpu/powerflags.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/powerflags.c b/arch/x86/kernel/cpu/powerflags.c
index 31f0f33..1dd8294 100644
--- a/arch/x86/kernel/cpu/powerflags.c
+++ b/arch/x86/kernel/cpu/powerflags.c
@@ -18,4 +18,6 @@ const char *const x86_power_flags[32] = {
"", /* tsc invariant mapped to constant_tsc */
"cpb",  /* core performance boost */
"eff_freq_ro", /* Readonly aperf/mperf */
+   "proc_feedback", /* processor feedback interface */
+   "acc_power", /* accumulated power mechanism */
 };

Re: [PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm

2016-03-28 Thread Huang Rui

On Mon, Mar 28, 2016 at 11:33:27AM +0200, Borislav Petkov wrote:
> On Mon, Mar 28, 2016 at 01:32:14PM +0800, Huang Rui wrote:
> >  
> > return 0;
> > @@ -330,6 +446,9 @@ static int fam15h_power_init_data(struct pci_dev *f4,
> >  
> > data->max_cu_acc_power = tmp;
> >  
> > +   /* set default interval as 10 ms */
> 
> Because...?
> 

I checked with HW designer, milliseconds is also a reasonable interval of acc 
power.
And I cannot set too long here, because several seconds will cause the
read function to hang for that period of time.

So I pick 10ms here, and actually, we can update the interval at
/etc/sensors3.conf

chip "fam15h_power-*"
set power1_average_interval 0.01

Thanks,
Rui

Re: [PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm

2016-03-28 Thread Huang Rui

On Mon, Mar 28, 2016 at 11:33:27AM +0200, Borislav Petkov wrote:
> On Mon, Mar 28, 2016 at 01:32:14PM +0800, Huang Rui wrote:
> >  
> > return 0;
> > @@ -330,6 +446,9 @@ static int fam15h_power_init_data(struct pci_dev *f4,
> >  
> > data->max_cu_acc_power = tmp;
> >  
> > +   /* set default interval as 10 ms */
> 
> Because...?
> 

I checked with HW designer, milliseconds is also a reasonable interval of acc 
power.
And I cannot set too long here, because several seconds will cause the
read function to hang for that period of time.

So I pick 10ms here, and actually, we can update the interval at
/etc/sensors3.conf

chip "fam15h_power-*"
set power1_average_interval 0.01

Thanks,
Rui

Re: [PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-03-28 Thread Huang Rui

On Mon, Mar 28, 2016 at 11:29:52AM +0200, Borislav Petkov wrote:
> On Mon, Mar 28, 2016 at 01:32:12PM +0800, Huang Rui wrote:
> > +
> > +   get_online_cpus();
> > +   this_cpu = get_cpu();
> 
> What now?
> 
> get_online_cpus() is enough.
> 

Will remove get_cpu().

> > +
> > +   /*
> > +* Choose the first online core of each compute unit, and then
> > +* read their MSR value of power and ptsc in a single IPI,
> > +* because the MSR value of CPU core represent the compute
> > +* unit's.
> > +*/
> > +   for_each_online_cpu(cpu) {
> > +   target = cpumask_first(topology_sibling_cpumask(cpu));
> > +   if (!cpumask_test_cpu(target, mask))
> > +   cpumask_set_cpu(target, mask);
> > +   }
> 
> I think you want something like this: iterate over each core and put one
> of them into the mask.
> 
>   core = -1;
> 
>   for_each_online_cpu(cpu) {
>   this_core = topology_core_id(cpu);
> 
>   if (this_core == core)
>   continue;
> 
>   core = this_core;
> 
>   /* get any CPU on this compute unit */
>   cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), 
> mask);
>   }
> 

Yep, with new x86 topology for core on AMD, using this way should be
more clear.

Thanks,
Rui

Re: [PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-03-28 Thread Huang Rui

On Mon, Mar 28, 2016 at 11:29:52AM +0200, Borislav Petkov wrote:
> On Mon, Mar 28, 2016 at 01:32:12PM +0800, Huang Rui wrote:
> > +
> > +   get_online_cpus();
> > +   this_cpu = get_cpu();
> 
> What now?
> 
> get_online_cpus() is enough.
> 

Will remove get_cpu().

> > +
> > +   /*
> > +* Choose the first online core of each compute unit, and then
> > +* read their MSR value of power and ptsc in a single IPI,
> > +* because the MSR value of CPU core represent the compute
> > +* unit's.
> > +*/
> > +   for_each_online_cpu(cpu) {
> > +   target = cpumask_first(topology_sibling_cpumask(cpu));
> > +   if (!cpumask_test_cpu(target, mask))
> > +   cpumask_set_cpu(target, mask);
> > +   }
> 
> I think you want something like this: iterate over each core and put one
> of them into the mask.
> 
>   core = -1;
> 
>   for_each_online_cpu(cpu) {
>   this_core = topology_core_id(cpu);
> 
>   if (this_core == core)
>   continue;
> 
>   core = this_core;
> 
>   /* get any CPU on this compute unit */
>   cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), 
> mask);
>   }
> 

Yep, with new x86 topology for core on AMD, using this way should be
more clear.

Thanks,
Rui

[PATCH v5 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm

2016-03-27 Thread Huang Rui

This patch adds the description to explain the TDP reporting mechanism
and accumulated power algorithm.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
---
 Documentation/hwmon/fam15h_power | 57 +++-
 drivers/hwmon/fam15h_power.c |  2 +-
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power
index e2b1b69..2c4fbee 100644
--- a/Documentation/hwmon/fam15h_power
+++ b/Documentation/hwmon/fam15h_power
@@ -10,14 +10,22 @@ Supported chips:
   Datasheets:
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors
+  AMD64 Architecture Programmer's Manual Volume 2: System Programming
 
 Author: Andreas Herrmann <herrmann.der.u...@googlemail.com>
 
 Description
 ---
 
+1) Processor TDP (Thermal design power)
+
+Given a fixed frequency and voltage, the power consumption of a
+processor varies based on the workload being executed. Derated power
+is the power consumed when running a specific application. Thermal
+design power (TDP) is an example of derated power.
+
 This driver permits reading of registers providing power information
-of AMD Family 15h and 16h processors.
+of AMD Family 15h and 16h processors via TDP algorithm.
 
 For AMD Family 15h and 16h processors the following power values can
 be calculated using different processor northbridge function
@@ -37,3 +45,50 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts:
 On multi-node processors the calculated value is for the entire
 package and not for a single node. Thus the driver creates sysfs
 attributes only for internal node0 of a multi-node processor.
+
+2) Accumulated Power Mechanism
+
+This driver also introduces an algorithm that should be used to
+calculate the average power consumed by a processor during a
+measurement interval Tm. The feature of accumulated power mechanism is
+indicated by CPUID Fn8000_0007_EDX[12].
+
+* Tsample: compute unit power accumulator sample period
+* Tref: the PTSC counter period
+* PTSC: performance timestamp counter
+* N: the ratio of compute unit power accumulator sample period to the
+  PTSC period
+* Jmax: max compute unit accumulated power which is indicated by
+  MaxCpuSwPwrAcc MSR C001007b
+* Jx/Jy: compute unit accumulated power which is indicated by
+  CpuSwPwrAcc MSR C001007a
+* Tx/Ty: the value of performance timestamp counter which is indicated
+  by CU_PTSC MSR C0010280
+* PwrCPUave: CPU average power
+
+i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
+   N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
+
+ii. Read the full range of the cumulative energy value from the new
+MSR MaxCpuSwPwrAcc.
+   Jmax = value returned.
+iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jx = value read from CpuSwPwrAcc and Tx = value read from
+PTSC.
+
+iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jy = value read from CpuSwPwrAcc and Ty = value read from
+PTSC.
+
+v. Calculate the average power consumption for a compute unit over
+time period (y-x). Unit of result is uWatt.
+   if (Jy < Jx) // Rollover has occurred
+   Jdelta = (Jy + Jmax) - Jx
+   else
+   Jdelta = Jy - Jx
+   PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
+
+This driver provides PwrCPUave and interval(default is 10 millisecond
+and maximum is 1 second):
+* power1_average (PwrCPUave)
+* power1_average_interval (Interval)
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 003564b..c1cad26 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -1,7 +1,7 @@
 /*
  * fam15h_power.c - AMD Family 15h processor power monitoring
  *
- * Copyright (c) 2011 Advanced Micro Devices, Inc.
+ * Copyright (c) 2011-2016 Advanced Micro Devices, Inc.
  * Author: Andreas Herrmann <herrmann.der.u...@googlemail.com>
  *
  *
-- 
1.9.1

[PATCH v5 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm

2016-03-27 Thread Huang Rui

This patch adds the description to explain the TDP reporting mechanism
and accumulated power algorithm.

Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
---
 Documentation/hwmon/fam15h_power | 57 +++-
 drivers/hwmon/fam15h_power.c |  2 +-
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power
index e2b1b69..2c4fbee 100644
--- a/Documentation/hwmon/fam15h_power
+++ b/Documentation/hwmon/fam15h_power
@@ -10,14 +10,22 @@ Supported chips:
   Datasheets:
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors
   BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors
+  AMD64 Architecture Programmer's Manual Volume 2: System Programming
 
 Author: Andreas Herrmann 
 
 Description
 ---
 
+1) Processor TDP (Thermal design power)
+
+Given a fixed frequency and voltage, the power consumption of a
+processor varies based on the workload being executed. Derated power
+is the power consumed when running a specific application. Thermal
+design power (TDP) is an example of derated power.
+
 This driver permits reading of registers providing power information
-of AMD Family 15h and 16h processors.
+of AMD Family 15h and 16h processors via TDP algorithm.
 
 For AMD Family 15h and 16h processors the following power values can
 be calculated using different processor northbridge function
@@ -37,3 +45,50 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts:
 On multi-node processors the calculated value is for the entire
 package and not for a single node. Thus the driver creates sysfs
 attributes only for internal node0 of a multi-node processor.
+
+2) Accumulated Power Mechanism
+
+This driver also introduces an algorithm that should be used to
+calculate the average power consumed by a processor during a
+measurement interval Tm. The feature of accumulated power mechanism is
+indicated by CPUID Fn8000_0007_EDX[12].
+
+* Tsample: compute unit power accumulator sample period
+* Tref: the PTSC counter period
+* PTSC: performance timestamp counter
+* N: the ratio of compute unit power accumulator sample period to the
+  PTSC period
+* Jmax: max compute unit accumulated power which is indicated by
+  MaxCpuSwPwrAcc MSR C001007b
+* Jx/Jy: compute unit accumulated power which is indicated by
+  CpuSwPwrAcc MSR C001007a
+* Tx/Ty: the value of performance timestamp counter which is indicated
+  by CU_PTSC MSR C0010280
+* PwrCPUave: CPU average power
+
+i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
+   N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
+
+ii. Read the full range of the cumulative energy value from the new
+MSR MaxCpuSwPwrAcc.
+   Jmax = value returned.
+iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jx = value read from CpuSwPwrAcc and Tx = value read from
+PTSC.
+
+iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+   Jy = value read from CpuSwPwrAcc and Ty = value read from
+PTSC.
+
+v. Calculate the average power consumption for a compute unit over
+time period (y-x). Unit of result is uWatt.
+   if (Jy < Jx) // Rollover has occurred
+   Jdelta = (Jy + Jmax) - Jx
+   else
+   Jdelta = Jy - Jx
+   PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
+
+This driver provides PwrCPUave and interval(default is 10 millisecond
+and maximum is 1 second):
+* power1_average (PwrCPUave)
+* power1_average_interval (Interval)
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 003564b..c1cad26 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -1,7 +1,7 @@
 /*
  * fam15h_power.c - AMD Family 15h processor power monitoring
  *
- * Copyright (c) 2011 Advanced Micro Devices, Inc.
+ * Copyright (c) 2011-2016 Advanced Micro Devices, Inc.
  * Author: Andreas Herrmann 
  *
  *
-- 
1.9.1

[PATCH v5 6/6] hwmon: (fam15h_power) Add platform check function

2016-03-27 Thread Huang Rui

This patch adds a platform check function to make code more readable.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
---
 drivers/hwmon/fam15h_power.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index c1cad26..622c646 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -78,6 +78,11 @@ struct fam15h_power_data {
unsigned long power_period;
 };
 
+static bool is_carrizo_or_later(void)
+{
+   return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
+}
+
 static ssize_t show_power(struct device *dev,
  struct device_attribute *attr, char *buf)
 {
@@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, TdpRunAvgAccCap bit field
 * is extended to 4:31 from 4:25.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) {
+   if (is_carrizo_or_later()) {
running_avg_capture = val >> 4;
running_avg_capture = sign_extend32(running_avg_capture, 27);
} else {
@@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, ApmTdpLimit bit field
 * is extended to 16:31 from 16:28.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60)
+   if (is_carrizo_or_later())
tdp_limit = val >> 16;
else
tdp_limit = (val >> 16) & 0x1fff;
-- 
1.9.1

[PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm

2016-03-27 Thread Huang Rui

This patch introduces an algorithm that computes the average power by
reading a delta value of “core power accumulator” register during
measurement interval, and then dividing delta value by the length of
the time interval.

User is able to use power1_average entry to measure the processor power
consumption and power1_average_interval entry to set the interval.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

The result is current average processor power consumption in 10
millisecond. The unit of the result is uWatt.

Suggested-by: Guenter Roeck <li...@roeck-us.net>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
---
 drivers/hwmon/fam15h_power.c | 119 +++
 1 file changed, 119 insertions(+)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index de6f52b..003564b 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -48,6 +50,9 @@ MODULE_LICENSE("GPL");
 #define FAM15H_NUM_GROUPS  2
 #define MAX_CUS8
 
+/* set maximum interval as 1 second */
+#define MAX_INTERVAL   1000
+
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 #define MSR_F15H_PTSC  0xc0010280
@@ -68,6 +73,9 @@ struct fam15h_power_data {
u64 cu_acc_power[MAX_CUS];
/* performance timestamp counter */
u64 cpu_sw_pwr_ptsc[MAX_CUS];
+   /* online/offline status of current compute unit */
+   int cu_on[MAX_CUS];
+   unsigned long power_period;
 };
 
 static ssize_t show_power(struct device *dev,
@@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data)
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
+
+   data->cu_on[cu] = 1;
 }
 
 /*
@@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data)
if (!ret)
return -ENOMEM;
 
+   memset(data->cu_on, 0, sizeof(int) * MAX_CUS);
+
get_online_cpus();
this_cpu = get_cpu();
 
@@ -192,18 +204,117 @@ static int read_registers(struct fam15h_power_data *data)
return 0;
 }
 
+static ssize_t acc_show_power(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+   u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
+   jdelta[MAX_CUS];
+   u64 tdelta, avg_acc;
+   int cu, cu_num, ret;
+   signed long leftover;
+
+   /*
+* With the new x86 topology modelling, x86_max_cores is the
+* compute unit number.
+*/
+   cu_num = boot_cpu_data.x86_max_cores;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0; cu < cu_num; cu++) {
+   prev_cu_acc_power[cu] = data->cu_acc_power[cu];
+   prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu];
+   }
+
+   leftover = 
schedule_timeout_interruptible(msecs_to_jiffies(data->power_period));
+   if (leftover)
+   return 0;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0, avg_acc = 0; cu < cu_num; cu++) {
+   /* check if current compute unit is online */
+   if (data->cu_on[cu] == 0)
+   continue;
+
+   if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) {
+   jdelta[cu] = data->max_cu_acc_power + 
data->cu_acc_power[cu];
+   jdelta[cu] -= prev_cu_acc_power[cu];
+   } else {
+   jdelta[cu] = data->cu_acc_power[cu] - 
prev_cu_acc_power[cu];
+   }
+   tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu];
+   jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000;
+   do_div(jdelta[cu], tdelta);
+
+   /* the unit is microWatt */
+   avg_acc += jdelta[cu];
+   }
+
+   return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
+}
+static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
+
+static ssize_t acc_show_power_period(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+
+   return sprintf(buf, "%lu\n", data->power_period);
+}
+
+static ssize_t acc_set_power_period(struct device *dev,
+

[PATCH v5 6/6] hwmon: (fam15h_power) Add platform check function

2016-03-27 Thread Huang Rui

This patch adds a platform check function to make code more readable.

Signed-off-by: Huang Rui 
---
 drivers/hwmon/fam15h_power.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index c1cad26..622c646 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -78,6 +78,11 @@ struct fam15h_power_data {
unsigned long power_period;
 };
 
+static bool is_carrizo_or_later(void)
+{
+   return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
+}
+
 static ssize_t show_power(struct device *dev,
  struct device_attribute *attr, char *buf)
 {
@@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, TdpRunAvgAccCap bit field
 * is extended to 4:31 from 4:25.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) {
+   if (is_carrizo_or_later()) {
running_avg_capture = val >> 4;
running_avg_capture = sign_extend32(running_avg_capture, 27);
} else {
@@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev,
 * On Carrizo and later platforms, ApmTdpLimit bit field
 * is extended to 16:31 from 16:28.
 */
-   if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60)
+   if (is_carrizo_or_later())
tdp_limit = val >> 16;
else
tdp_limit = (val >> 16) & 0x1fff;
-- 
1.9.1

[PATCH v5 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm

2016-03-27 Thread Huang Rui

This patch introduces an algorithm that computes the average power by
reading a delta value of “core power accumulator” register during
measurement interval, and then dividing delta value by the length of
the time interval.

User is able to use power1_average entry to measure the processor power
consumption and power1_average_interval entry to set the interval.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

The result is current average processor power consumption in 10
millisecond. The unit of the result is uWatt.

Suggested-by: Guenter Roeck 
Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
---
 drivers/hwmon/fam15h_power.c | 119 +++
 1 file changed, 119 insertions(+)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index de6f52b..003564b 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -48,6 +50,9 @@ MODULE_LICENSE("GPL");
 #define FAM15H_NUM_GROUPS  2
 #define MAX_CUS8
 
+/* set maximum interval as 1 second */
+#define MAX_INTERVAL   1000
+
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 #define MSR_F15H_PTSC  0xc0010280
@@ -68,6 +73,9 @@ struct fam15h_power_data {
u64 cu_acc_power[MAX_CUS];
/* performance timestamp counter */
u64 cpu_sw_pwr_ptsc[MAX_CUS];
+   /* online/offline status of current compute unit */
+   int cu_on[MAX_CUS];
+   unsigned long power_period;
 };
 
 static ssize_t show_power(struct device *dev,
@@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data)
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
+
+   data->cu_on[cu] = 1;
 }
 
 /*
@@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data)
if (!ret)
return -ENOMEM;
 
+   memset(data->cu_on, 0, sizeof(int) * MAX_CUS);
+
get_online_cpus();
this_cpu = get_cpu();
 
@@ -192,18 +204,117 @@ static int read_registers(struct fam15h_power_data *data)
return 0;
 }
 
+static ssize_t acc_show_power(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+   u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
+   jdelta[MAX_CUS];
+   u64 tdelta, avg_acc;
+   int cu, cu_num, ret;
+   signed long leftover;
+
+   /*
+* With the new x86 topology modelling, x86_max_cores is the
+* compute unit number.
+*/
+   cu_num = boot_cpu_data.x86_max_cores;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0; cu < cu_num; cu++) {
+   prev_cu_acc_power[cu] = data->cu_acc_power[cu];
+   prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu];
+   }
+
+   leftover = 
schedule_timeout_interruptible(msecs_to_jiffies(data->power_period));
+   if (leftover)
+   return 0;
+
+   ret = read_registers(data);
+   if (ret)
+   return 0;
+
+   for (cu = 0, avg_acc = 0; cu < cu_num; cu++) {
+   /* check if current compute unit is online */
+   if (data->cu_on[cu] == 0)
+   continue;
+
+   if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) {
+   jdelta[cu] = data->max_cu_acc_power + 
data->cu_acc_power[cu];
+   jdelta[cu] -= prev_cu_acc_power[cu];
+   } else {
+   jdelta[cu] = data->cu_acc_power[cu] - 
prev_cu_acc_power[cu];
+   }
+   tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu];
+   jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000;
+   do_div(jdelta[cu], tdelta);
+
+   /* the unit is microWatt */
+   avg_acc += jdelta[cu];
+   }
+
+   return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
+}
+static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
+
+static ssize_t acc_show_power_period(struct device *dev,
+struct device_attribute *attr,
+char *buf)
+{
+   struct fam15h_power_data *data = dev_get_drvdata(dev);
+
+   return sprintf(buf, "%lu\n", data->power_period);
+}
+
+static ssize_t acc_set_power_period(struct device *dev,
+   struct device_attribute *attr,
+

[PATCH v5 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm

2016-03-27 Thread Huang Rui

Hi Guenter,

This serial of patches introduces an accumulated power reporting
algorithm. It will calculate the average power consumption for the
processor. The cpu feature flag is CPUID.8000_0007H:EDX[12].

This algorithm is used to test the comparison of processor power
consumption with between MWAITX delay and TSC delay on AMD Carrizo
platforms.

Reference:
https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com

Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support")
Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a 
configurable timer")

V1: 
https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com
V2: 
https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com
V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com
V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com

On V5, I used the new x86 topology with compute unit id (cpu_core_id)
and compute unit numbers (x86_max_cores) for AMD processors. The
change and documentation will go to master soon as below link. And
This way is better to use smp_num_siblings which is un-defined
variable out of CONFIG_SMP.

Refer:
https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-urgent=f6ce1851fa2eec2c332255fc25a544658dbfbfe4

Changes from v1 -> v2:
- Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to
  avoid overwrite on multi-CPU system.
- Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code.
- Remove unnecessary warning print.
- Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one
  of the online cores on each compute unit with smp_call_function_many().
- Use power1_average and power1_average_interval standard entry
  instread of power1_acc
- Fix the CPU-hotplug case.

Changes from v2 -> v3:
- As Guenter's suggestion, remove typecast, use >groups[0].
- Remove all "fam15_power_*" prefix at data.
- Remove unnecessary ( ).
- Fix the issue that is reported by build test robot, and add
  CPU_SUP_AMD as the dependence of fam15h_power
- Remove the WARN_ON at do_read_registers_on_cu, because it must be
  behind CPUID check. The MSR must be available since
  CPUID.8000_0007H:EDX[12] is set 
- Add get_online_cpus()/put_online_cpus() functions.
- Refine commments and the method which generate cpumask for cu.
- Add the interval scope to make the value suitable for user
  experience
- Remove the useless mutex.

Changes from v3 -> v4:
- Rebase the patches to latest groeck/hwmon-next.
- Use smp_num_siblings instead of cores_per_cu accessor.
- Refine the cpumask method which is inspired by perf solution.
- Fix some typo and errors.

Changes from v4 -> v5:
- Rebase patches to v4.6-rc1.
- Use new x86 topology with compute id (cpu_core_id) and compute unit
  number (x86_max_cores) instead of smp_num_siblings.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

These patches are rebased on v4.6-rc1.

Thanks,
Rui

Huang Rui (6):
  hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
  hwmon: (fam15h_power) Add compute unit accumulated power
  hwmon: (fam15h_power) Add ptsc counter value for accumulated power
  hwmon: (fam15h_power) Introduce a cpu accumulated power reporting
algorithm
  hwmon: (fam15h_power) Add documentation for TDP and accumulated power
algorithm
  hwmon: (fam15h_power) Add platform check function

 Documentation/hwmon/fam15h_power |  57 ++-
 drivers/hwmon/Kconfig|   2 +-
 drivers/hwmon/fam15h_power.c | 199 ++-
 3 files changed, 252 insertions(+), 6 deletions(-)

-- 
1.9.1

[PATCH v5 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm

2016-03-27 Thread Huang Rui

Hi Guenter,

This serial of patches introduces an accumulated power reporting
algorithm. It will calculate the average power consumption for the
processor. The cpu feature flag is CPUID.8000_0007H:EDX[12].

This algorithm is used to test the comparison of processor power
consumption with between MWAITX delay and TSC delay on AMD Carrizo
platforms.

Reference:
https://lkml.kernel.org/r/1438744732-1459-1-git-send-email-ray.hu...@amd.com

Commit f96756 at tip ("x86/asm: Add MONITORX/MWAITX instruction support")
Commit b466bd at tip ("x86/asm/delay: Introduce an MWAITX-based delay with a 
configurable timer")

V1: 
https://lkml.kernel.org/r/1440662866-28716-1-git-send-email-ray.hu...@amd.com
V2: 
https://lkml.kernel.org/r/1445308109-17970-1-git-send-email-ray.hu...@amd.com
V3: https://lkml.kernel.org/r/1446199024-1896-1-git-send-email-ray.hu...@amd.com
V4: https://lkml.kernel.org/r/1457662670-3354-1-git-send-email-ray.hu...@amd.com

On V5, I used the new x86 topology with compute unit id (cpu_core_id)
and compute unit numbers (x86_max_cores) for AMD processors. The
change and documentation will go to master soon as below link. And
This way is better to use smp_num_siblings which is un-defined
variable out of CONFIG_SMP.

Refer:
https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-urgent=f6ce1851fa2eec2c332255fc25a544658dbfbfe4

Changes from v1 -> v2:
- Move fam15h_power_groups and fam15h_power_group into fam15h_power_data to
  avoid overwrite on multi-CPU system.
- Rename FAM15H_MIN_NUM_ATTRS macro and fix return error code.
- Remove unnecessary warning print.
- Adds do_read_registers_on_cu to do all the read to all MSRs and run it on one
  of the online cores on each compute unit with smp_call_function_many().
- Use power1_average and power1_average_interval standard entry
  instread of power1_acc
- Fix the CPU-hotplug case.

Changes from v2 -> v3:
- As Guenter's suggestion, remove typecast, use >groups[0].
- Remove all "fam15_power_*" prefix at data.
- Remove unnecessary ( ).
- Fix the issue that is reported by build test robot, and add
  CPU_SUP_AMD as the dependence of fam15h_power
- Remove the WARN_ON at do_read_registers_on_cu, because it must be
  behind CPUID check. The MSR must be available since
  CPUID.8000_0007H:EDX[12] is set 
- Add get_online_cpus()/put_online_cpus() functions.
- Refine commments and the method which generate cpumask for cu.
- Add the interval scope to make the value suitable for user
  experience
- Remove the useless mutex.

Changes from v3 -> v4:
- Rebase the patches to latest groeck/hwmon-next.
- Use smp_num_siblings instead of cores_per_cu accessor.
- Refine the cpumask method which is inspired by perf solution.
- Fix some typo and errors.

Changes from v4 -> v5:
- Rebase patches to v4.6-rc1.
- Use new x86 topology with compute id (cpu_core_id) and compute unit
  number (x86_max_cores) instead of smp_num_siblings.

A simple example:

ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1:   19.58 mW (avg =   2.55 mW, interval =   0.01 s)
   (crit =  15.00 W)

...

These patches are rebased on v4.6-rc1.

Thanks,
Rui

Huang Rui (6):
  hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
  hwmon: (fam15h_power) Add compute unit accumulated power
  hwmon: (fam15h_power) Add ptsc counter value for accumulated power
  hwmon: (fam15h_power) Introduce a cpu accumulated power reporting
algorithm
  hwmon: (fam15h_power) Add documentation for TDP and accumulated power
algorithm
  hwmon: (fam15h_power) Add platform check function

 Documentation/hwmon/fam15h_power |  57 ++-
 drivers/hwmon/Kconfig|   2 +-
 drivers/hwmon/fam15h_power.c | 199 ++-
 3 files changed, 252 insertions(+), 6 deletions(-)

-- 
1.9.1

[PATCH v5 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence

2016-03-27 Thread Huang Rui

This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power
driver. Because the following patch will use the interface from
x86/kernel/cpu/amd.c.

Otherwise, the below error might be encountered:

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `fam15h_power_probe':
>> fam15h_power.c:(.text+0x26e3a3): undefined reference to
>> `amd_get_cores_per_cu'
   fam15h_power.c:(.text+0x26e41e): undefined reference to
`amd_get_cores_per_cu'

Reported-by: build test robot <l...@intel.com>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
---
 drivers/hwmon/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 5c2d13a..4be3792 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -288,7 +288,7 @@ config SENSORS_K10TEMP
 
 config SENSORS_FAM15H_POWER
tristate "AMD Family 15h processor power"
-   depends on X86 && PCI
+   depends on X86 && PCI && CPU_SUP_AMD
help
  If you say yes here you get support for processor power
  information of your AMD family 15h CPU.
-- 
1.9.1

[PATCH v5 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power

2016-03-27 Thread Huang Rui

PTSC is the performance timestamp counter value in a cpu core and the
cores in one compute unit have the fixed frequency. So it picks up the
performance timestamp counter value of the first core per compute unit
to measure the interval for average power per compute unit.

Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
---
 drivers/hwmon/fam15h_power.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index ccbc944..de6f52b 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -50,6 +50,7 @@ MODULE_LICENSE("GPL");
 
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
+#define MSR_F15H_PTSC  0xc0010280
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
 
@@ -65,6 +66,8 @@ struct fam15h_power_data {
u64 max_cu_acc_power;
/* accumulated power of the compute units */
u64 cu_acc_power[MAX_CUS];
+   /* performance timestamp counter */
+   u64 cpu_sw_pwr_ptsc[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data)
cu = cpu_data(cpu).cpu_core_id;
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+   rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
 }
 
 /*
-- 
1.9.1

[PATCH v5 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence

2016-03-27 Thread Huang Rui

This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power
driver. Because the following patch will use the interface from
x86/kernel/cpu/amd.c.

Otherwise, the below error might be encountered:

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `fam15h_power_probe':
>> fam15h_power.c:(.text+0x26e3a3): undefined reference to
>> `amd_get_cores_per_cu'
   fam15h_power.c:(.text+0x26e41e): undefined reference to
`amd_get_cores_per_cu'

Reported-by: build test robot 
Signed-off-by: Huang Rui 
---
 drivers/hwmon/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 5c2d13a..4be3792 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -288,7 +288,7 @@ config SENSORS_K10TEMP
 
 config SENSORS_FAM15H_POWER
tristate "AMD Family 15h processor power"
-   depends on X86 && PCI
+   depends on X86 && PCI && CPU_SUP_AMD
help
  If you say yes here you get support for processor power
  information of your AMD family 15h CPU.
-- 
1.9.1

[PATCH v5 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power

2016-03-27 Thread Huang Rui

PTSC is the performance timestamp counter value in a cpu core and the
cores in one compute unit have the fixed frequency. So it picks up the
performance timestamp counter value of the first core per compute unit
to measure the interval for average power per compute unit.

Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
---
 drivers/hwmon/fam15h_power.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index ccbc944..de6f52b 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -50,6 +50,7 @@ MODULE_LICENSE("GPL");
 
 #define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
+#define MSR_F15H_PTSC  0xc0010280
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
 
@@ -65,6 +66,8 @@ struct fam15h_power_data {
u64 max_cu_acc_power;
/* accumulated power of the compute units */
u64 cu_acc_power[MAX_CUS];
+   /* performance timestamp counter */
+   u64 cpu_sw_pwr_ptsc[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data)
cu = cpu_data(cpu).cpu_core_id;
 
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+   rdmsrl_safe(MSR_F15H_PTSC, >cpu_sw_pwr_ptsc[cu]);
 }
 
 /*
-- 
1.9.1

[PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-03-27 Thread Huang Rui

This patch adds a member in fam15h_power_data which specifies the
compute unit accumulated power. It adds do_read_registers_on_cu to do
all the read to all MSRs and run it on one of the online cores on each
compute unit with smp_call_function_many(). This behavior can decrease
IPI numbers.

Suggested-by: Borislav Petkov <b...@alien8.de>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
---
 drivers/hwmon/fam15h_power.c | 65 +++-
 1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4f695d8..ccbc944 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");
 
 #define FAM15H_MIN_NUM_ATTRS   2
 #define FAM15H_NUM_GROUPS  2
+#define MAX_CUS8
 
+#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -59,6 +63,8 @@ struct fam15h_power_data {
struct attribute_group group;
/* maximum accumulated power of a compute unit */
u64 max_cu_acc_power;
+   /* accumulated power of the compute units */
+   u64 cu_acc_power[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -125,6 +131,63 @@ static ssize_t show_power_crit(struct device *dev,
 }
 static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
 
+static void do_read_registers_on_cu(void *_data)
+{
+   struct fam15h_power_data *data = _data;
+   int cpu, cu;
+
+   cpu = smp_processor_id();
+
+   /*
+* With the new x86 topology modelling, cpu core id actually
+* is compute unit id.
+*/
+   cu = cpu_data(cpu).cpu_core_id;
+
+   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+}
+
+/*
+ * This function is only able to be called when CPUID
+ * Fn8000_0007:EDX[12] is set.
+ */
+static int read_registers(struct fam15h_power_data *data)
+{
+   int this_cpu, ret, cpu;
+   int target;
+   cpumask_var_t mask;
+
+   ret = zalloc_cpumask_var(, GFP_KERNEL);
+   if (!ret)
+   return -ENOMEM;
+
+   get_online_cpus();
+   this_cpu = get_cpu();
+
+   /*
+* Choose the first online core of each compute unit, and then
+* read their MSR value of power and ptsc in a single IPI,
+* because the MSR value of CPU core represent the compute
+* unit's.
+*/
+   for_each_online_cpu(cpu) {
+   target = cpumask_first(topology_sibling_cpumask(cpu));
+   if (!cpumask_test_cpu(target, mask))
+   cpumask_set_cpu(target, mask);
+   }
+
+   if (cpumask_test_cpu(this_cpu, mask))
+   do_read_registers_on_cu(data);
+
+   smp_call_function_many(mask, do_read_registers_on_cu, data, true);
+   put_cpu();
+   put_online_cpus();
+
+   free_cpumask_var(mask);
+
+   return 0;
+}
+
 static int fam15h_power_init_attrs(struct pci_dev *pdev,
   struct fam15h_power_data *data)
 {
@@ -263,7 +326,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
 
data->max_cu_acc_power = tmp;
 
-   return 0;
+   return read_registers(data);
 }
 
 static int fam15h_power_probe(struct pci_dev *pdev,
-- 
1.9.1

[PATCH v5 2/6] hwmon: (fam15h_power) Add compute unit accumulated power

2016-03-27 Thread Huang Rui

This patch adds a member in fam15h_power_data which specifies the
compute unit accumulated power. It adds do_read_registers_on_cu to do
all the read to all MSRs and run it on one of the online cores on each
compute unit with smp_call_function_many(). This behavior can decrease
IPI numbers.

Suggested-by: Borislav Petkov 
Signed-off-by: Huang Rui 
---
 drivers/hwmon/fam15h_power.c | 65 +++-
 1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4f695d8..ccbc944 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -25,6 +25,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");
 
 #define FAM15H_MIN_NUM_ATTRS   2
 #define FAM15H_NUM_GROUPS  2
+#define MAX_CUS8
 
+#define MSR_F15H_CU_PWR_ACCUMULATOR0xc001007a
 #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR0xc001007b
 
 #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -59,6 +63,8 @@ struct fam15h_power_data {
struct attribute_group group;
/* maximum accumulated power of a compute unit */
u64 max_cu_acc_power;
+   /* accumulated power of the compute units */
+   u64 cu_acc_power[MAX_CUS];
 };
 
 static ssize_t show_power(struct device *dev,
@@ -125,6 +131,63 @@ static ssize_t show_power_crit(struct device *dev,
 }
 static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
 
+static void do_read_registers_on_cu(void *_data)
+{
+   struct fam15h_power_data *data = _data;
+   int cpu, cu;
+
+   cpu = smp_processor_id();
+
+   /*
+* With the new x86 topology modelling, cpu core id actually
+* is compute unit id.
+*/
+   cu = cpu_data(cpu).cpu_core_id;
+
+   rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, >cu_acc_power[cu]);
+}
+
+/*
+ * This function is only able to be called when CPUID
+ * Fn8000_0007:EDX[12] is set.
+ */
+static int read_registers(struct fam15h_power_data *data)
+{
+   int this_cpu, ret, cpu;
+   int target;
+   cpumask_var_t mask;
+
+   ret = zalloc_cpumask_var(, GFP_KERNEL);
+   if (!ret)
+   return -ENOMEM;
+
+   get_online_cpus();
+   this_cpu = get_cpu();
+
+   /*
+* Choose the first online core of each compute unit, and then
+* read their MSR value of power and ptsc in a single IPI,
+* because the MSR value of CPU core represent the compute
+* unit's.
+*/
+   for_each_online_cpu(cpu) {
+   target = cpumask_first(topology_sibling_cpumask(cpu));
+   if (!cpumask_test_cpu(target, mask))
+   cpumask_set_cpu(target, mask);
+   }
+
+   if (cpumask_test_cpu(this_cpu, mask))
+   do_read_registers_on_cu(data);
+
+   smp_call_function_many(mask, do_read_registers_on_cu, data, true);
+   put_cpu();
+   put_online_cpus();
+
+   free_cpumask_var(mask);
+
+   return 0;
+}
+
 static int fam15h_power_init_attrs(struct pci_dev *pdev,
   struct fam15h_power_data *data)
 {
@@ -263,7 +326,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
 
data->max_cu_acc_power = tmp;
 
-   return 0;
+   return read_registers(data);
 }
 
 static int fam15h_power_probe(struct pci_dev *pdev,
-- 
1.9.1

[tip:perf/urgent] perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL

2016-03-25 Thread tip-bot for Huang Rui

Commit-ID:  a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d
Gitweb: http://git.kernel.org/tip/a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d
Author: Huang Rui <ray.hu...@amd.com>
AuthorDate: Fri, 25 Mar 2016 11:18:25 +0800
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Fri, 25 Mar 2016 09:46:53 +0100

perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL

randconfig builds can sometimes disable CONFIG_CPU_SUP_INTEL while
enabling the AMD power reporting PMU driver, resulting in this
build failure:

  arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' 
undeclared here (not in a function)

To fix it, move events_sysfs_show() outside of #ifdef CONFIG_CPU_SUP_INTEL.

Reported-by: Randy Dunlap <rdun...@infradead.org>
Reported-by: build test robot <l...@intel.com>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: Fengguang Wu <fengguang...@intel.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Sherry Hurwitz <sherry.hurw...@amd.com>
Cc: Stephen Rothwell <s...@canb.auug.org.au>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: a...@kernel.org
Cc: kbuild-...@01.org
Cc: linux-n...@vger.kernel.org
Cc: spg_linux_ker...@amd.com
Link: 
http://lkml.kernel.org/r/1458875905-4278-1-git-send-email-ray.hu...@amd.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/events/perf_event.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ba6ef18..a6771e2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config);
 
 struct attribute **merge_attr(struct attribute **a, struct attribute **b);
 
+ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
+ char *page);
+
 #ifdef CONFIG_CPU_SUP_AMD
 
 int amd_pmu_init(void);
@@ -925,9 +928,6 @@ int p6_pmu_init(void);
 
 int knc_pmu_init(void);
 
-ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
- char *page);
-
 static inline int is_ht_workaround_enabled(void)
 {
return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);

[tip:perf/urgent] perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL

2016-03-25 Thread tip-bot for Huang Rui

Commit-ID:  a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d
Gitweb: http://git.kernel.org/tip/a49ac9f83b31e41c8311d64bd2b3f97a23dcb38d
Author: Huang Rui 
AuthorDate: Fri, 25 Mar 2016 11:18:25 +0800
Committer:  Ingo Molnar 
CommitDate: Fri, 25 Mar 2016 09:46:53 +0100

perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL

randconfig builds can sometimes disable CONFIG_CPU_SUP_INTEL while
enabling the AMD power reporting PMU driver, resulting in this
build failure:

  arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' 
undeclared here (not in a function)

To fix it, move events_sysfs_show() outside of #ifdef CONFIG_CPU_SUP_INTEL.

Reported-by: Randy Dunlap 
Reported-by: build test robot 
Signed-off-by: Huang Rui 
Cc: Borislav Petkov 
Cc: Borislav Petkov 
Cc: Fengguang Wu 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Sherry Hurwitz 
Cc: Stephen Rothwell 
Cc: Thomas Gleixner 
Cc: a...@kernel.org
Cc: kbuild-...@01.org
Cc: linux-n...@vger.kernel.org
Cc: spg_linux_ker...@amd.com
Link: 
http://lkml.kernel.org/r/1458875905-4278-1-git-send-email-ray.hu...@amd.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/events/perf_event.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ba6ef18..a6771e2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config);
 
 struct attribute **merge_attr(struct attribute **a, struct attribute **b);
 
+ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
+ char *page);
+
 #ifdef CONFIG_CPU_SUP_AMD
 
 int amd_pmu_init(void);
@@ -925,9 +928,6 @@ int p6_pmu_init(void);
 
 int knc_pmu_init(void);
 
-ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
- char *page);
-
 static inline int is_ht_workaround_enabled(void)
 {
return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);

[REDO PATCH] x86/perf: Move events_sysfs_show() outside CPU_SUP_INTEL

2016-03-24 Thread Huang Rui

This patch moves events_sysfs_show outside CONFIG_CPU_SUP_INTEL, because
this interface will be also used in an AMD power reporting PMU driver.

Otherwise, below build error would be encountered:

All error/warnings (new ones prefixed by >>):

 In file included from include/linux/kobject.h:21:0,
  from include/linux/module.h:17,
  from arch/x86/kernel/cpu/perf_event_amd_power.c:13:
  >> arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' 
undeclared here (not in a function)
   .attr  = __ATTR(_name, 0444, events_sysfs_show, NULL), \
^
 include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR'
   .show = _show,  \
   ^
  >> arch/x86/kernel/cpu/perf_event_amd_power.c:244:1: note: in expansion of 
macro 'EVENT_ATTR_STR'
  EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01");
  ^

Reported-by: Randy Dunlap <rdun...@infradead.org>
Reported-by: build test robot <l...@intel.com>
Signed-off-by: Huang Rui <ray.hu...@amd.com>
Cc: Stephen Rothwell <s...@canb.auug.org.au>
Cc: a...@kernel.org
---

Hi Peter, Ingo, Randy,

According to the comments of below thread, I rebased this fix to
latest kernel.

http://lkml.kernel.org/r/56f41e3f.8000...@infradead.org

Orignal patch:
https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1

Thanks,
Rui

---
 arch/x86/events/perf_event.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ba6ef18..a6771e2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config);
 
 struct attribute **merge_attr(struct attribute **a, struct attribute **b);
 
+ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
+ char *page);
+
 #ifdef CONFIG_CPU_SUP_AMD
 
 int amd_pmu_init(void);
@@ -925,9 +928,6 @@ int p6_pmu_init(void);
 
 int knc_pmu_init(void);
 
-ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
- char *page);
-
 static inline int is_ht_workaround_enabled(void)
 {
return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);
-- 
1.9.1

[REDO PATCH] x86/perf: Move events_sysfs_show() outside CPU_SUP_INTEL

2016-03-24 Thread Huang Rui

This patch moves events_sysfs_show outside CONFIG_CPU_SUP_INTEL, because
this interface will be also used in an AMD power reporting PMU driver.

Otherwise, below build error would be encountered:

All error/warnings (new ones prefixed by >>):

 In file included from include/linux/kobject.h:21:0,
  from include/linux/module.h:17,
  from arch/x86/kernel/cpu/perf_event_amd_power.c:13:
  >> arch/x86/kernel/cpu/perf_event.h:663:31: error: 'events_sysfs_show' 
undeclared here (not in a function)
   .attr  = __ATTR(_name, 0444, events_sysfs_show, NULL), \
^
 include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR'
   .show = _show,  \
   ^
  >> arch/x86/kernel/cpu/perf_event_amd_power.c:244:1: note: in expansion of 
macro 'EVENT_ATTR_STR'
  EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01");
  ^

Reported-by: Randy Dunlap 
Reported-by: build test robot 
Signed-off-by: Huang Rui 
Cc: Stephen Rothwell 
Cc: a...@kernel.org
---

Hi Peter, Ingo, Randy,

According to the comments of below thread, I rebased this fix to
latest kernel.

http://lkml.kernel.org/r/56f41e3f.8000...@infradead.org

Orignal patch:
https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1

Thanks,
Rui

---
 arch/x86/events/perf_event.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ba6ef18..a6771e2 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -795,6 +795,9 @@ ssize_t intel_event_sysfs_show(char *page, u64 config);
 
 struct attribute **merge_attr(struct attribute **a, struct attribute **b);
 
+ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
+ char *page);
+
 #ifdef CONFIG_CPU_SUP_AMD
 
 int amd_pmu_init(void);
@@ -925,9 +928,6 @@ int p6_pmu_init(void);
 
 int knc_pmu_init(void);
 
-ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
- char *page);
-
 static inline int is_ht_workaround_enabled(void)
 {
return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);
-- 
1.9.1

Re: linux-next: Tree for Mar 24 (events/amd/power)

2016-03-24 Thread Huang Rui

On Thu, Mar 24, 2016 at 10:05:03AM -0700, Randy Dunlap wrote:
> On 03/23/16 19:09, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Please do not add any v4.7 related material to your linux-next included
> > trees until after v4.6-rc1 is released.
> > 
> > Changes since 20160323:
> > 
> 
> on i386:
> 
> when CONFIG_CPU_SUP_INTEL is not enabled:
> 
> In file included from ../include/linux/kobject.h:21:0,
>  from ../include/linux/module.h:17,
>  from ../arch/x86/events/amd/power.c:13:
> ../arch/x86/events/amd/../perf_event.h:660:31: error: 'events_sysfs_show' 
> undeclared here (not in a function)
>   .attr  = __ATTR(_name, 0444, events_sysfs_show, NULL), \
>^
> ../include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR'
>   .show = _show,  \
>   ^
> ../arch/x86/events/amd/power.c:181:1: note: in expansion of macro 
> 'EVENT_ATTR_STR'
>  EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01");
>  ^
> ../scripts/Makefile.build:291: recipe for target 
> 'arch/x86/events/amd/power.o' failed
> 

Randy, thanks to report it.
Actually, because one patch is missed to apply.

https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1

I will rebase and re-send it.

Thanks,
Rui

Re: linux-next: Tree for Mar 24 (events/amd/power)

2016-03-24 Thread Huang Rui

On Thu, Mar 24, 2016 at 10:05:03AM -0700, Randy Dunlap wrote:
> On 03/23/16 19:09, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Please do not add any v4.7 related material to your linux-next included
> > trees until after v4.6-rc1 is released.
> > 
> > Changes since 20160323:
> > 
> 
> on i386:
> 
> when CONFIG_CPU_SUP_INTEL is not enabled:
> 
> In file included from ../include/linux/kobject.h:21:0,
>  from ../include/linux/module.h:17,
>  from ../arch/x86/events/amd/power.c:13:
> ../arch/x86/events/amd/../perf_event.h:660:31: error: 'events_sysfs_show' 
> undeclared here (not in a function)
>   .attr  = __ATTR(_name, 0444, events_sysfs_show, NULL), \
>^
> ../include/linux/sysfs.h:103:10: note: in definition of macro '__ATTR'
>   .show = _show,  \
>   ^
> ../arch/x86/events/amd/power.c:181:1: note: in expansion of macro 
> 'EVENT_ATTR_STR'
>  EVENT_ATTR_STR(power-pkg, power_pkg, "event=0x01");
>  ^
> ../scripts/Makefile.build:291: recipe for target 
> 'arch/x86/events/amd/power.o' failed
> 

Randy, thanks to report it.
Actually, because one patch is missed to apply.

https://git.kernel.org/cgit/linux/kernel/git/bp/bp.git/commit/?h=tip-perf=f5ba42d22cb478d32363b2b8e92e14b1fd190ce1

I will rebase and re-send it.

Thanks,
Rui

1 2 3 4 5 6 7 >

1 - 100 of 672 matches

Mail list logo