Re: Spontaneous reboots when using RX 560

2019-10-19 Thread Sylvain Munaut
Finally some progress !

I found a thread with a couple of people having the same symptoms as I
do ( [1] ), and interestingly that was with the same brand & model of
card.
Although there is no solution, there is a workaround that works:

echo -n low  > /sys/class/drm/card0/device/power_dpm_force_performance_level

Then the card seems stable. At least I was able to get through an
entire GL benchmark and also a bunch of CL tests without crashing. (By
default it crashes nearly instantly).
Of course the card is slow but it's better than nothing and maybe
gives a clue to a solution ?

Following some advice on IRC, I also tried setting it to "high". That
doesn't crash immediately: the display stays fine and I can move
windows and do light stuff, but as soon as I actually run GL or CL
workloads, it crashes.
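
For anyone who wants to script the workaround instead of using echo, here
is a minimal C sketch of the same write, plus a read-back. The function
names are made up for illustration, and the path is passed in by the caller
(on the system above it would be
/sys/class/drm/card0/device/power_dpm_force_performance_level). Writing to
the real file needs root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a performance level string such as "low" or "high" to a
 * sysfs-style attribute file. Returns 0 on success, -1 on failure.
 * set_perf_level() is a hypothetical helper, not an existing API.
 */
int set_perf_level(const char *path, const char *level)
{
    assert(path != NULL && level != NULL);

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    /* No trailing newline, matching `echo -n` in the workaround. */
    int ok = fputs(level, f) >= 0;

    /* The data is flushed on close, so a rejected write shows up here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/* Read the current level into buf and strip a trailing newline. */
int get_perf_level(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 0);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Reading the value back after writing is a cheap way to confirm that the
driver accepted the requested level.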

I also dumped the Power Play tables, see [2]. I can't really
understand them. There are definitely some weird values, but I'm not
sure whether that's normal.

As I noted earlier in the thread, when I first used the card on
Windows with just AMD's driver, the card was stuck at its lowest
clock rate and performed poorly in benchmarks. It was only after I
loaded ASRock's own tweak utility that the card started to auto-adapt
its clocks / voltages. Not sure if there is a way to dump the Windows
Power Play config?


Cheers,

   Sylvain

[1] https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-amd-linux/1112121-rx-560-crash-under-light-load
[2] https://pastebin.com/raw/uWh6WLmh
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: Spontaneous reboots when using RX 560

2019-10-19 Thread Sylvain Munaut
Just in case there was any doubt, it seems an OpenCL workload crashes
the card just as hard.
(That was the AMDGPU-Pro OpenCL lib, legacy version. I can't get PAL to
detect the card at all.)

Cheers,

 Sylvain

RE: [PATCH 1/3] drm/amd/powerplay: add lock protection for swSMU APIs V2

2019-10-19 Thread Xu, Feifei
Acked-by: Feifei Xu 

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Quan, Evan
Sent: Friday, October 18, 2019 10:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander; Grodzovsky, Andrey; Quan, Evan
Subject: [PATCH 1/3] drm/amd/powerplay: add lock protection for swSMU APIs V2

This is a quick and low-risk fix. Those APIs which
are exposed to other IPs, or which support the sysfs/hwmon
interfaces or DAL, will have lock protection. Meanwhile,
no lock protection is enforced for APIs used only inside
swSMU. Future optimization is needed.

V2: strip the lock protection for all swSMU internal APIs

Change-Id: I8392652c9da1574a85acd9b171f04380f3630852
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c      |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h      |   6 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c       |  23 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c |   4 +-
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c   | 696 --
 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c |   3 -
 .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h   | 163 ++--
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c   |  15 +-
 drivers/gpu/drm/amd/powerplay/renoir_ppt.c   |  14 +-
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c    |  22 +-
 drivers/gpu/drm/amd/powerplay/smu_v12_0.c    |   3 -
 drivers/gpu/drm/amd/powerplay/vega20_ppt.c   |  20 +-
 12 files changed, 777 insertions(+), 198 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
index 263265245e19..28d32725285b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
@@ -912,7 +912,8 @@ int amdgpu_dpm_get_sclk(struct amdgpu_device *adev, bool low)
if (is_support_sw_smu(adev)) {
ret = smu_get_dpm_freq_range(&adev->smu, SMU_GFXCLK,
 low ? &clk_freq : NULL,
-!low ? &clk_freq : NULL);
+!low ? &clk_freq : NULL,
+true);
if (ret)
return 0;
return clk_freq * 100;
@@ -930,7 +931,8 @@ int amdgpu_dpm_get_mclk(struct amdgpu_device *adev, bool low)
if (is_support_sw_smu(adev)) {
ret = smu_get_dpm_freq_range(&adev->smu, SMU_UCLK,
 low ? &clk_freq : NULL,
-!low ? &clk_freq : NULL);
+!low ? &clk_freq : NULL,
+true);
if (ret)
return 0;
return clk_freq * 100;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
index 1c5c0fd76dbf..2cfb677272af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
@@ -298,12 +298,6 @@ enum amdgpu_pcie_gen {
 #define amdgpu_dpm_get_current_power_state(adev) \
        ((adev)->powerplay.pp_funcs->get_current_power_state((adev)->powerplay.pp_handle))
 
-#define amdgpu_smu_get_current_power_state(adev) \
-   ((adev)->smu.ppt_funcs->get_current_power_state(&((adev)->smu)))
-
-#define amdgpu_smu_set_power_state(adev) \
-   ((adev)->smu.ppt_funcs->set_power_state(&((adev)->smu)))
-
 #define amdgpu_dpm_get_pp_num_states(adev, data) \
        ((adev)->powerplay.pp_funcs->get_pp_num_states((adev)->powerplay.pp_handle, data))
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index c50d5f1e75e5..36f36b35000d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -211,7 +211,7 @@ static ssize_t amdgpu_get_dpm_state(struct device *dev,
 
if (is_support_sw_smu(adev)) {
if (adev->smu.ppt_funcs->get_current_power_state)
-   pm = amdgpu_smu_get_current_power_state(adev);
+   pm = smu_get_current_power_state(&adev->smu);
else
pm = adev->pm.dpm.user_state;
} else if (adev->powerplay.pp_funcs->get_current_power_state) {
@@ -957,7 +957,7 @@ static ssize_t amdgpu_set_pp_dpm_sclk(struct device *dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_SCLK, mask);
+   ret = smu_force_clk_levels(&adev->smu, SMU_SCLK, mask, true);
else if (adev->powerplay.pp_funcs->force_clock_level)
ret = amdgpu_dpm_force_clock_level(adev, PP_SCLK, mask);
 
@@ -1004,7 +1004,7 @@ static ssize_t amdgpu_set_pp_dpm_mclk(struct device *dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_MCLK, mask);
+   ret = smu_force_clk_lev
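
The extra `true` argument in the calls above looks like a "take the lock"
flag: callers from outside swSMU ask for locking, while internal callers
that already hold the mutex do not. The definition is cut off in this
excerpt, so the following is only a userspace sketch of that pattern under
that assumption. It uses a pthread mutex, and every name starting with
demo_ is hypothetical, not taken from the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SMU context; not the kernel's struct. */
struct demo_smu {
    pthread_mutex_t mutex;
    int clk_level;
};

/* Internal helper: assumes the caller already holds smu->mutex. */
static int demo_set_level_nolock(struct demo_smu *smu, int level)
{
    if (level < 0)
        return -1;
    smu->clk_level = level;
    return 0;
}

/*
 * API with an explicit lock_needed flag. External callers pass true;
 * callers that already hold the mutex pass false to avoid locking twice.
 */
int demo_force_clk_level(struct demo_smu *smu, int level, bool lock_needed)
{
    int ret;

    assert(smu != NULL);

    if (lock_needed)
        pthread_mutex_lock(&smu->mutex);

    ret = demo_set_level_nolock(smu, level);

    if (lock_needed)
        pthread_mutex_unlock(&smu->mutex);

    return ret;
}

/* Internal sequence: takes the mutex once, then calls the API unlocked. */
int demo_reset_levels(struct demo_smu *smu)
{
    int ret;

    assert(smu != NULL);

    pthread_mutex_lock(&smu->mutex);
    ret = demo_force_clk_level(smu, 0, false);
    pthread_mutex_unlock(&smu->mutex);

    return ret;
}
```

The cost of this design is that every caller has to know whether it already
holds the lock, which is probably why the commit message says future
optimization is needed.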

[PATCH] drm/amd/amdgpu: correct length misspelling

2019-10-19 Thread Wambui Karuga
Correct the "_LENTH" misspelling in the AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
constant.

Signed-off-by: Wambui Karuga 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index c5b3c0c9193b..aaab37833659 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -86,7 +86,7 @@
 #define KMS_DRIVER_MINOR   34
 #define KMS_DRIVER_PATCHLEVEL  0
 
-#define AMDGPU_MAX_TIMEOUT_PARAM_LENTH 256
+#define AMDGPU_MAX_TIMEOUT_PARAM_LENGTH 256
 
 int amdgpu_vram_limit = 0;
 int amdgpu_vis_vram_limit = 0;
@@ -100,7 +100,7 @@ int amdgpu_disp_priority = 0;
 int amdgpu_hw_i2c = 0;
 int amdgpu_pcie_gen2 = -1;
 int amdgpu_msi = -1;
-static char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENTH];
+static char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH];
 int amdgpu_dpm = -1;
 int amdgpu_fw_load_type = -1;
 int amdgpu_aspm = -1;
@@ -1327,9 +1327,9 @@ int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
adev->sdma_timeout = adev->video_timeout = adev->gfx_timeout;
adev->compute_timeout = MAX_SCHEDULE_TIMEOUT;
 
-   if (strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENTH)) {
+   if (strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) {
while ((timeout_setting = strsep(&input, ",")) &&
-   strnlen(timeout_setting, AMDGPU_MAX_TIMEOUT_PARAM_LENTH)) {
+   strnlen(timeout_setting, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) {
ret = kstrtol(timeout_setting, 0, &timeout);
if (ret)
return ret;
-- 
2.23.0
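
The loop touched by this patch splits a comma-separated parameter string
and converts each piece to a number. As a rough userspace illustration of
that strnlen()/strsep() pattern, here is a small sketch. It uses strtol()
in place of the kernel-only kstrtol(), and parse_timeouts() and
MAX_PARAM_LENGTH are names invented for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for strsep() on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PARAM_LENGTH 256 /* same value as the constant in the patch */

/*
 * Split a comma-separated list such as "1000,2000" into at most max longs.
 * input is modified in place by strsep(). Parsing stops at the first empty
 * token. Returns the number of values parsed, or -1 if a token is not a
 * valid integer.
 */
int parse_timeouts(char *input, long *out, int max)
{
    char *token;
    int n = 0;

    assert(out != NULL);

    if (!input || !strnlen(input, MAX_PARAM_LENGTH))
        return 0;

    while (n < max && (token = strsep(&input, ",")) &&
           strnlen(token, MAX_PARAM_LENGTH)) {
        char *end;
        long value;

        errno = 0;
        value = strtol(token, &end, 0); /* base 0 accepts 0x.. and 0.. */
        if (errno || end == token || *end != '\0')
            return -1;

        out[n++] = value;
    }

    return n;
}
```

Note that strsep() writes NUL bytes into the buffer, so the input must be a
writable array and not a string literal.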



[PATCH] drm/radeon: remove assignment for return value

2019-10-19 Thread Wambui Karuga
Remove unnecessary assignment for return value and have the
function return the required value directly.
Issue found by coccinelle:
@@
local idexpression ret;
expression e;
@@

-ret =
+return
 e;
-return ret;

Signed-off-by: Wambui Karuga 
---
 drivers/gpu/drm/radeon/cik.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 62eab82a64f9..daff9a2af3be 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -221,9 +221,7 @@ int ci_get_temp(struct radeon_device *rdev)
else
actual_temp = temp & 0x1ff;
 
-   actual_temp = actual_temp * 1000;
-
-   return actual_temp;
+   return actual_temp * 1000;
 }
 
 /* get temperature in millidegrees */
@@ -239,9 +237,7 @@ int kv_get_temp(struct radeon_device *rdev)
else
actual_temp = 0;
 
-   actual_temp = actual_temp * 1000;
-
-   return actual_temp;
+   return actual_temp * 1000;
 }
 
 /*
-- 
2.23.0
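
To make the semantic patch above concrete, here is a small standalone
sketch of the before and after shapes it matches and produces. The function
names are invented for illustration and only the `& 0x1ff` mask and the
`* 1000` scaling are borrowed from the hunks above. The rest of the real
functions is not reproduced.

```c
#include <assert.h>

/* Before: the scaled value is stored back into the local, then returned. */
int to_millidegrees_before(int temp)
{
    int actual_temp = temp & 0x1ff;

    actual_temp = actual_temp * 1000;

    return actual_temp;
}

/* After: the expression is returned directly, as the patch does. */
int to_millidegrees_after(int temp)
{
    int actual_temp = temp & 0x1ff;

    return actual_temp * 1000;
}

/* Sanity check that both forms agree over the whole 9-bit input range. */
void check_equivalent(void)
{
    for (int t = 0; t <= 0x1ff; t++)
        assert(to_millidegrees_before(t) == to_millidegrees_after(t));
}
```

The transformation is safe here because the local is not read again after
the assignment, which is what the `local idexpression` constraint in the
rule is there to help ensure.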



[PATCH] drm/amd/amdgpu: make undeclared variables static

2019-10-19 Thread Wambui Karuga
Make the `amdgpu_lockup_timeout` and `amdgpu_exp_hw_support` variables
static to remove the following sparse warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:103:19: warning: symbol 'amdgpu_lockup_timeout' was not declared. Should it be static?
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:117:18: warning: symbol 'amdgpu_exp_hw_support' was not declared. Should it be static?

Signed-off-by: Wambui Karuga 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3fae1007143e..c5b3c0c9193b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -100,7 +100,7 @@ int amdgpu_disp_priority = 0;
 int amdgpu_hw_i2c = 0;
 int amdgpu_pcie_gen2 = -1;
 int amdgpu_msi = -1;
-char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENTH];
+static char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENTH];
 int amdgpu_dpm = -1;
 int amdgpu_fw_load_type = -1;
 int amdgpu_aspm = -1;
@@ -114,7 +114,7 @@ int amdgpu_vm_block_size = -1;
 int amdgpu_vm_fault_stop = 0;
 int amdgpu_vm_debug = 0;
 int amdgpu_vm_update_mode = -1;
-int amdgpu_exp_hw_support = 0;
+static int amdgpu_exp_hw_support;
 int amdgpu_dc = -1;
 int amdgpu_sched_jobs = 32;
 int amdgpu_sched_hw_submission = 2;
-- 
2.23.0
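
As background for the change above, here is a minimal standalone sketch of
what `static` does to a file-scope variable and why the explicit `= 0` can
be dropped. All names starting with demo_ are made up for this example.

```c
#include <assert.h>

/*
 * External linkage: visible to other translation units. Without a matching
 * declaration in a header, this is the kind of symbol sparse warns about.
 */
int demo_visible_flag = 0;

/*
 * Internal linkage: only visible in this file. Objects with static storage
 * duration are zero-initialized, so no explicit "= 0" is needed.
 */
static int demo_private_flag;

/* Accessors, since the variable itself cannot be named from other files. */
int demo_get_private_flag(void)
{
    return demo_private_flag;
}

void demo_set_private_flag(int value)
{
    assert(value == 0 || value == 1);
    demo_private_flag = value;
}
```

Making a variable static is only correct if no other file refers to it, so
it is worth grepping the tree for the symbol before applying such a change.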