RE: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test

2020-08-10 Thread Sheng, Wenhui
[AMD Public Use]

Hi Gunchun,

Thanks for pointing this out, for rap_test file node, I thought I need to use 
the same file mode as ras previously,  it seems that I'm wrong, for this 
rap_test file node, I think only S_IWUSR is enough, what's your opinion?

Brs
Wenhui 

-Original Message-
From: Chen, Guchun  
Sent: Tuesday, August 11, 2020 11:33 AM
To: Sheng, Wenhui ; amd-gfx@lists.freedesktop.org
Cc: Clements, John ; Sheng, Wenhui 
; Zhang, Hawking 
Subject: RE: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test

[AMD Public Use]

One comment inline, please check.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Wenhui Sheng
Sent: Tuesday, August 11, 2020 11:04 AM
To: amd-gfx@lists.freedesktop.org
Cc: Clements, John ; Sheng, Wenhui 
; Zhang, Hawking 
Subject: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test

After amdgpu driver loading successfully, we can use RAP debugfs interface 
/dri/xxx/rap_test to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h

Signed-off-by: Wenhui Sheng 
---
 drivers/gpu/drm/amd/amdgpu/Makefile |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 127 
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h |  28 +
 4 files changed, 159 insertions(+), 1 deletion(-)  create mode 100644 
drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 0ba396e9d7e4..dec1927ca75d 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -55,7 +55,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
amdgpu_gmc.o amdgpu_mmhub.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o 
amdgpu_vm_cpu.o \
amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \
-   amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o
+   amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o
 
 amdgpu-$(CONFIG_PERF_EVENTS) += amdgpu_pmu.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 3a4b31b1c4f2..0af249a1e35b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -34,6 +34,7 @@
 #include "amdgpu_pm.h"
 #include "amdgpu_dm_debugfs.h"
 #include "amdgpu_ras.h"
+#include "amdgpu_rap.h"
 
 /**
  * amdgpu_debugfs_add_files - Add simple debugfs entries @@ -1623,6 +1624,8 @@ 
int amdgpu_debugfs_init(struct amdgpu_device *adev)
 
amdgpu_debugfs_autodump_init(adev);
 
+   amdgpu_rap_debugfs_init(adev);
+
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
ARRAY_SIZE(amdgpu_debugfs_list));
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
new file mode 100644
index ..46b2607a7bd1
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
@@ -0,0 +1,127 @@
+/*
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person 
+obtaining a
+ * copy of this software and associated documentation files (the 
+"Software"),
+ * to deal in the Software without restriction, including without 
+limitation
+ * the rights to use, copy, modify, merge, publish, distribute, 
+sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom 
+the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be 
+included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
+EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
+MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
+SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
+DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
+OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE 
+OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ *
+ */
+#include 
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_rap.h"
+
+/**
+ * DOC: AMDGPU RAP debugfs test interface
+ *
+ * how to use?
+ * echo opcode > /dri/xxx/rap_test
+ *
+ * opcode:
+ * currently, only 2 is supported by Linux host driver,
+ * opcode 2 stands for TA_CMD_RAP__VALIDATE_L0, used to
+ * trigger L0 policy validation, you can refer more detail
+ * from header file ta_rap_if.h
+ *
+ */
+static ssize_t amdgpu_rap_debugfs_write(struct file *f, const char __user *buf,
+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device 
*)file_inode(f)->i_private;
+   struct 

[PATCH] drm/amd/powerplay: minor cleanups

2020-08-10 Thread Evan Quan
Drop unnecessary lock protections during hw setup which was confirmed
to have no race condition. Drop also unnecessary null pointer checker.

Change-Id: Ida301ae7bad1abae15285c4e019eda4f7dc6e297
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c| 20 
 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c  |  2 -
 .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h|  1 -
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c|  2 -
 .../drm/amd/powerplay/sienna_cichlid_ppt.c|  2 -
 drivers/gpu/drm/amd/powerplay/smu_internal.h  |  1 +
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 50 ---
 7 files changed, 11 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index 1ffacc712e53..c70f94377644 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -361,20 +361,16 @@ static int smu_get_driver_allowed_feature_mask(struct 
smu_context *smu)
int ret = 0;
uint32_t allowed_feature_mask[SMU_FEATURE_MAX/32];
 
-   mutex_lock(>mutex);
bitmap_zero(feature->allowed, SMU_FEATURE_MAX);
-   mutex_unlock(>mutex);
 
ret = smu_get_allowed_feature_mask(smu, allowed_feature_mask,
 SMU_FEATURE_MAX/32);
if (ret)
return ret;
 
-   mutex_lock(>mutex);
bitmap_or(feature->allowed, feature->allowed,
  (unsigned long *)allowed_feature_mask,
  feature->feature_num);
-   mutex_unlock(>mutex);
 
return ret;
 }
@@ -576,9 +572,6 @@ static int smu_fini_fb_allocations(struct smu_context *smu)
struct smu_table *tables = smu_table->tables;
struct smu_table *driver_table = &(smu_table->driver_table);
 
-   if (!tables)
-   return 0;
-
if (tables[SMU_TABLE_PMSTATUSLOG].mc_address)
amdgpu_bo_free_kernel([SMU_TABLE_PMSTATUSLOG].bo,
  [SMU_TABLE_PMSTATUSLOG].mc_address,
@@ -2250,19 +2243,6 @@ int smu_set_deep_sleep_dcefclk(struct smu_context *smu, 
int clk)
return ret;
 }
 
-int smu_set_active_display_count(struct smu_context *smu, uint32_t count)
-{
-   int ret = 0;
-
-   if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
-   return -EOPNOTSUPP;
-
-   if (smu->ppt_funcs->set_active_display_count)
-   ret = smu->ppt_funcs->set_active_display_count(smu, count);
-
-   return ret;
-}
-
 int smu_get_clock_by_type(struct smu_context *smu,
  enum amd_pp_clock_type type,
  struct amd_pp_clocks *clocks)
diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c 
b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
index 8b1025dc54fd..25679d60f7b4 100644
--- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
@@ -386,11 +386,9 @@ static int arcturus_check_powerplay_table(struct 
smu_context *smu)
table_context->power_play_table;
struct smu_baco_context *smu_baco = >smu_baco;
 
-   mutex_lock(_baco->mutex);
if (powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_BACO ||
powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_MACO)
smu_baco->platform_support = true;
-   mutex_unlock(_baco->mutex);
 
table_context->thermal_controller_type =
powerplay_table->thermal_controller_type;
diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
index 23c2279bd500..8de39b31e7c2 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
@@ -698,7 +698,6 @@ int smu_set_fan_speed_percent(struct smu_context *smu, 
uint32_t speed);
 int smu_get_fan_speed_rpm(struct smu_context *smu, uint32_t *speed);
 
 int smu_set_deep_sleep_dcefclk(struct smu_context *smu, int clk);
-int smu_set_active_display_count(struct smu_context *smu, uint32_t count);
 
 int smu_get_clock_by_type(struct smu_context *smu,
  enum amd_pp_clock_type type,
diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c 
b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index 42a60769c52f..61e2971be9f3 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -346,11 +346,9 @@ static int navi10_check_powerplay_table(struct smu_context 
*smu)
if (powerplay_table->platform_caps & 
SMU_11_0_PP_PLATFORM_CAP_HARDWAREDC)
smu->dc_controlled_by_gpio = true;
 
-   mutex_lock(_baco->mutex);
if (powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_BACO ||
powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_MACO)
smu_baco->platform_support = true;
-   mutex_unlock(_baco->mutex);
 

[PATCH] drm/amd/powerplay: bump NAVI12 driver if version

2020-08-10 Thread Evan Quan
To fit the latest SMU firmware.

Change-Id: Ic9d02de54d20b6b90d18bac8b3fbb356d8fdf3ad
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h 
b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
index ee1506beb0ea..65363d56e3cc 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
@@ -28,7 +28,7 @@
 #define SMU11_DRIVER_IF_VERSION_INV 0x
 #define SMU11_DRIVER_IF_VERSION_ARCT 0x17
 #define SMU11_DRIVER_IF_VERSION_NV10 0x36
-#define SMU11_DRIVER_IF_VERSION_NV12 0x33
+#define SMU11_DRIVER_IF_VERSION_NV12 0x36
 #define SMU11_DRIVER_IF_VERSION_NV14 0x36
 #define SMU11_DRIVER_IF_VERSION_Sienna_Cichlid 0x35
 #define SMU11_DRIVER_IF_VERSION_Navy_Flounder 0x3
-- 
2.28.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/2] drm/amd/powerplay: maximum the code sharing around metrics table retrieving

2020-08-10 Thread Evan Quan
Instead of having one copy in each ASIC.

Change-Id: I5e2a72382700cdb0e4847e5d9e2143f4b5508cdb
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c  | 55 ++-
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c| 41 +++---
 drivers/gpu/drm/amd/powerplay/renoir_ppt.c| 40 ++
 .../drm/amd/powerplay/sienna_cichlid_ppt.c| 55 ++-
 drivers/gpu/drm/amd/powerplay/smu_cmn.c   | 45 +++
 drivers/gpu/drm/amd/powerplay/smu_cmn.h   |  8 +++
 6 files changed, 77 insertions(+), 167 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c 
b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
index e59e6fb6f0a8..8b1025dc54fd 100644
--- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
@@ -541,49 +541,6 @@ static int arcturus_freqs_in_same_level(int32_t frequency1,
return (abs(frequency1 - frequency2) <= EPSILON);
 }
 
-static int arcturus_get_metrics_table_locked(struct smu_context *smu,
-SmuMetrics_t *metrics_table,
-bool bypass_cache)
-{
-   struct smu_table_context *smu_table= >smu_table;
-   int ret = 0;
-
-   if (bypass_cache ||
-   !smu_table->metrics_time ||
-   time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(1))) 
{
-   ret = smu_cmn_update_table(smu,
-  SMU_TABLE_SMU_METRICS,
-  0,
-  smu_table->metrics_table,
-  false);
-   if (ret) {
-   dev_info(smu->adev->dev, "Failed to export SMU metrics 
table!\n");
-   return ret;
-   }
-   smu_table->metrics_time = jiffies;
-   }
-
-   if (metrics_table)
-   memcpy(metrics_table, smu_table->metrics_table, 
sizeof(SmuMetrics_t));
-
-   return 0;
-}
-
-static int arcturus_get_metrics_table(struct smu_context *smu,
- SmuMetrics_t *metrics_table,
- bool bypass_cache)
-{
-   int ret = 0;
-
-   mutex_lock(>metrics_lock);
-   ret = arcturus_get_metrics_table_locked(smu,
-   metrics_table,
-   bypass_cache);
-   mutex_unlock(>metrics_lock);
-
-   return ret;
-}
-
 static int arcturus_get_smu_metrics_data(struct smu_context *smu,
 MetricsMember_t member,
 uint32_t *value)
@@ -594,9 +551,9 @@ static int arcturus_get_smu_metrics_data(struct smu_context 
*smu,
 
mutex_lock(>metrics_lock);
 
-   ret = arcturus_get_metrics_table_locked(smu,
-   NULL,
-   false);
+   ret = smu_cmn_get_metrics_table_locked(smu,
+  NULL,
+  false);
if (ret) {
mutex_unlock(>metrics_lock);
return ret;
@@ -2305,9 +2262,9 @@ static ssize_t arcturus_get_gpu_metrics(struct 
smu_context *smu,
SmuMetrics_t metrics;
int ret = 0;
 
-   ret = arcturus_get_metrics_table(smu,
-,
-true);
+   ret = smu_cmn_get_metrics_table(smu,
+   ,
+   true);
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c 
b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index 82659b781f05..42a60769c52f 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -504,34 +504,6 @@ static int navi10_tables_init(struct smu_context *smu)
return -ENOMEM;
 }
 
-static int navi10_get_metrics_table_locked(struct smu_context *smu,
-  SmuMetrics_t *metrics_table,
-  bool bypass_cache)
-{
-   struct smu_table_context *smu_table= >smu_table;
-   int ret = 0;
-
-   if (bypass_cache ||
-   !smu_table->metrics_time ||
-   time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(1))) 
{
-   ret = smu_cmn_update_table(smu,
-  SMU_TABLE_SMU_METRICS,
-  0,
-  smu_table->metrics_table,
-  false);
-   if (ret) {
-   dev_info(smu->adev->dev, "Failed to export SMU metrics 
table!\n");
-   return ret;
-   }
-   smu_table->metrics_time = 

[PATCH 1/2] drm/amd/powerplay: update the metrics table cache interval as 1ms

2020-08-10 Thread Evan Quan
To make the setting same as Arcturus/Navi1x/Sienna_Cichlid.

Change-Id: I7ea721bf5872023f1ab39c3827fb9c6fd05877cc
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 2 +-
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 2 +-
 drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
index c70c30175801..f0680dd58508 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
@@ -1272,7 +1272,7 @@ static int vega12_get_metrics_table(struct pp_hwmgr 
*hwmgr,
 
if (bypass_cache ||
!data->metrics_time ||
-   time_after(jiffies, data->metrics_time + HZ / 2)) {
+   time_after(jiffies, data->metrics_time + msecs_to_jiffies(1))) {
ret = smum_smc_table_manager(hwmgr,
 (uint8_t *)(>metrics_table),
 TABLE_SMU_METRICS,
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index c9f402edc0d6..da84012b7fd5 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -2082,7 +2082,7 @@ static int vega20_get_metrics_table(struct pp_hwmgr 
*hwmgr,
 
if (bypass_cache ||
!data->metrics_time ||
-   time_after(jiffies, data->metrics_time + HZ / 2)) {
+   time_after(jiffies, data->metrics_time + msecs_to_jiffies(1))) {
ret = smum_smc_table_manager(hwmgr,
 (uint8_t *)(>metrics_table),
 TABLE_SMU_METRICS,
diff --git a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c 
b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
index 8a8e6033f71f..c50c4547fea9 100644
--- a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c
@@ -139,7 +139,7 @@ static int renoir_get_metrics_table(struct smu_context *smu,
 
if (bypass_cache ||
!smu_table->metrics_time ||
-   time_after(jiffies, smu_table->metrics_time + 
msecs_to_jiffies(100))) {
+   time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(1))) 
{
ret = smu_cmn_update_table(smu, SMU_TABLE_SMU_METRICS, 0,
(void *)smu_table->metrics_table, false);
if (ret) {
-- 
2.28.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test

2020-08-10 Thread Chen, Guchun
[AMD Public Use]

One comment inline, please check.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Wenhui Sheng
Sent: Tuesday, August 11, 2020 11:04 AM
To: amd-gfx@lists.freedesktop.org
Cc: Clements, John ; Sheng, Wenhui 
; Zhang, Hawking 
Subject: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test

After amdgpu driver loading successfully, we can use RAP debugfs interface 
/dri/xxx/rap_test to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h

Signed-off-by: Wenhui Sheng 
---
 drivers/gpu/drm/amd/amdgpu/Makefile |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 127 
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h |  28 +
 4 files changed, 159 insertions(+), 1 deletion(-)  create mode 100644 
drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 0ba396e9d7e4..dec1927ca75d 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -55,7 +55,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
amdgpu_gmc.o amdgpu_mmhub.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o 
amdgpu_vm_cpu.o \
amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \
-   amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o
+   amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o
 
 amdgpu-$(CONFIG_PERF_EVENTS) += amdgpu_pmu.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 3a4b31b1c4f2..0af249a1e35b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -34,6 +34,7 @@
 #include "amdgpu_pm.h"
 #include "amdgpu_dm_debugfs.h"
 #include "amdgpu_ras.h"
+#include "amdgpu_rap.h"
 
 /**
  * amdgpu_debugfs_add_files - Add simple debugfs entries @@ -1623,6 +1624,8 @@ 
int amdgpu_debugfs_init(struct amdgpu_device *adev)
 
amdgpu_debugfs_autodump_init(adev);
 
+   amdgpu_rap_debugfs_init(adev);
+
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
ARRAY_SIZE(amdgpu_debugfs_list));
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
new file mode 100644
index ..46b2607a7bd1
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
@@ -0,0 +1,127 @@
+/*
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person 
+obtaining a
+ * copy of this software and associated documentation files (the 
+"Software"),
+ * to deal in the Software without restriction, including without 
+limitation
+ * the rights to use, copy, modify, merge, publish, distribute, 
+sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom 
+the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be 
+included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
+EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
+MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
+SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
+DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
+OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE 
+OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ *
+ */
+#include 
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_rap.h"
+
+/**
+ * DOC: AMDGPU RAP debugfs test interface
+ *
+ * how to use?
+ * echo opcode > /dri/xxx/rap_test
+ *
+ * opcode:
+ * currently, only 2 is supported by Linux host driver,
+ * opcode 2 stands for TA_CMD_RAP__VALIDATE_L0, used to
+ * trigger L0 policy validation, you can refer more detail
+ * from header file ta_rap_if.h
+ *
+ */
+static ssize_t amdgpu_rap_debugfs_write(struct file *f, const char __user *buf,
+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device 
*)file_inode(f)->i_private;
+   struct ta_rap_shared_memory *rap_shared_mem;
+   struct ta_rap_cmd_output_data *rap_cmd_output;
+   struct drm_device *dev = adev->ddev;
+   uint32_t op;
+   int ret;
+
+   if (*pos || size != 2)
+   return -EINVAL;
+
+   ret = kstrtouint_from_user(buf, size, *pos, );
+   if (ret)
+   return ret;
+
+   ret = pm_runtime_get_sync(dev->dev);
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
+   return ret;
+   }
+
+   /* make sure gfx core is 

[PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test

2020-08-10 Thread Wenhui Sheng
After amdgpu driver loading successfully, we can use
RAP debugfs interface /dri/xxx/rap_test
to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h

Signed-off-by: Wenhui Sheng 
---
 drivers/gpu/drm/amd/amdgpu/Makefile |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 127 
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h |  28 +
 4 files changed, 159 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
b/drivers/gpu/drm/amd/amdgpu/Makefile
index 0ba396e9d7e4..dec1927ca75d 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -55,7 +55,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \
amdgpu_gmc.o amdgpu_mmhub.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o 
amdgpu_vm_cpu.o \
amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \
-   amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o
+   amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o
 
 amdgpu-$(CONFIG_PERF_EVENTS) += amdgpu_pmu.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 3a4b31b1c4f2..0af249a1e35b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -34,6 +34,7 @@
 #include "amdgpu_pm.h"
 #include "amdgpu_dm_debugfs.h"
 #include "amdgpu_ras.h"
+#include "amdgpu_rap.h"
 
 /**
  * amdgpu_debugfs_add_files - Add simple debugfs entries
@@ -1623,6 +1624,8 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev)
 
amdgpu_debugfs_autodump_init(adev);
 
+   amdgpu_rap_debugfs_init(adev);
+
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list,
ARRAY_SIZE(amdgpu_debugfs_list));
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
new file mode 100644
index ..46b2607a7bd1
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
@@ -0,0 +1,127 @@
+/*
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ *
+ */
+#include 
+#include 
+
+#include "amdgpu.h"
+#include "amdgpu_rap.h"
+
+/**
+ * DOC: AMDGPU RAP debugfs test interface
+ *
+ * how to use?
+ * echo opcode > /dri/xxx/rap_test
+ *
+ * opcode:
+ * currently, only 2 is supported by Linux host driver,
+ * opcode 2 stands for TA_CMD_RAP__VALIDATE_L0, used to
+ * trigger L0 policy validation, you can refer more detail
+ * from header file ta_rap_if.h
+ *
+ */
+static ssize_t amdgpu_rap_debugfs_write(struct file *f, const char __user *buf,
+   size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device 
*)file_inode(f)->i_private;
+   struct ta_rap_shared_memory *rap_shared_mem;
+   struct ta_rap_cmd_output_data *rap_cmd_output;
+   struct drm_device *dev = adev->ddev;
+   uint32_t op;
+   int ret;
+
+   if (*pos || size != 2)
+   return -EINVAL;
+
+   ret = kstrtouint_from_user(buf, size, *pos, );
+   if (ret)
+   return ret;
+
+   ret = pm_runtime_get_sync(dev->dev);
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
+   return ret;
+   }
+
+   /* make sure gfx core is on, RAP TA cann't handle
+* GFX OFF case currently.
+*/
+   amdgpu_gfx_off_ctrl(adev, false);
+
+   switch (op) {
+   case 2:
+   ret = psp_rap_invoke(>psp, op);
+
+   if (ret == TA_RAP_STATUS__SUCCESS) {
+   dev_info(adev->dev, "RAP L0 validate test success.\n");
+   } else {
+  

[PATCH 1/3] drm/amdgpu: add RAP TA header file

2020-08-10 Thread Wenhui Sheng
The RAP TA contains tests used to verify if
RAP(Register Access Policy), or otherwise known
as Security Policy is applied correctly
by PSP BL

The RAP test is a measure to ensure that we reduce
the avenue of complexity and mistakes when dealing
with RAP in post-si execution, where debugging failures
related to RAP is quite difficult and expensive.

v2: add introduction for RAP TA

Signed-off-by: Wenhui Sheng 
---
 drivers/gpu/drm/amd/amdgpu/ta_rap_if.h | 84 ++
 1 file changed, 84 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/ta_rap_if.h

diff --git a/drivers/gpu/drm/amd/amdgpu/ta_rap_if.h 
b/drivers/gpu/drm/amd/amdgpu/ta_rap_if.h
new file mode 100644
index ..f14833fae07c
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/ta_rap_if.h
@@ -0,0 +1,84 @@
+/*
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _TA_RAP_IF_H
+#define _TA_RAP_IF_H
+
+/* Responses have bit 31 set */
+#define RSP_ID_MASK (1U << 31)
+#define RSP_ID(cmdId) (((uint32_t)(cmdId)) | RSP_ID_MASK)
+
+enum ta_rap_status {
+   TA_RAP_STATUS__SUCCESS  = 1,
+   TA_RAP_STATUS__ERROR_GENERIC_FAILURE= 2,
+   TA_RAP_STATUS__ERROR_CMD_NOT_SUPPORTED  = 3,
+   TA_RAP_STATUS__ERROR_INVALID_VALIDATION_METHOD  = 4,
+   TA_RAP_STATUS__ERROR_NULL_POINTER   = 5,
+   TA_RAP_STATUS__ERROR_NOT_INITIALIZED= 6,
+   TA_RAP_STATUS__ERROR_VALIDATION_FAILED  = 7,
+   TA_RAP_STATUS__ERROR_ASIC_NOT_SUPPORTED = 8,
+   TA_RAP_STATUS__ERROR_OPERATION_NOT_PERMISSABLE  = 9,
+   TA_RAP_STATUS__ERROR_ALREADY_INIT   = 10,
+};
+
+enum ta_rap_cmd {
+   TA_CMD_RAP__INITIALIZE  = 1,
+   TA_CMD_RAP__VALIDATE_L0 = 2,
+};
+
+enum ta_rap_validation_method {
+   METHOD_A   = 1,
+};
+
+struct ta_rap_cmd_input_data {
+   uint8_t reserved[8];
+};
+
+struct ta_rap_cmd_output_data {
+   uint32_tlast_subsection;
+   uint32_tnum_total_validate;
+   uint32_tnum_valid;
+   uint32_tlast_validate_addr;
+   uint32_tlast_validate_val;
+   uint32_tlast_validate_val_exptd;
+};
+
+union ta_rap_cmd_input {
+   struct ta_rap_cmd_input_data input;
+};
+
+union ta_rap_cmd_output {
+   struct ta_rap_cmd_output_data output;
+};
+
+struct ta_rap_shared_memory {
+   uint32_tcmd_id;
+   uint32_tvalidation_method_id;
+   uint32_tresp_id;
+   enum ta_rap_status  rap_status;
+   union ta_rap_cmd_input  rap_in_message;
+   union ta_rap_cmd_output rap_out_message;
+   uint8_t reserved[64];
+};
+
+#endif  // #define _TA_RAP_IF_H
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/3] drm/amdgpu: enable RAP TA load

2020-08-10 Thread Wenhui Sheng
Enable the RAP TA loading path and add RAP test
trigger interface.

v2: fix potential mem leak issue

Signed-off-by: Wenhui Sheng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   | 183 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |  17 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h |   1 +
 3 files changed, 201 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index c68369731b20..116a89990f39 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1430,6 +1430,168 @@ static int psp_dtm_terminate(struct psp_context *psp)
 }
 // DTM end
 
+// RAP start
+static int psp_rap_init_shared_buf(struct psp_context *psp)
+{
+   int ret;
+
+   /*
+* Allocate 16k memory aligned to 4k from Frame Buffer (local
+* physical) for rap ta <-> Driver
+*/
+   ret = amdgpu_bo_create_kernel(psp->adev, PSP_RAP_SHARED_MEM_SIZE,
+ PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
+ >rap_context.rap_shared_bo,
+ >rap_context.rap_shared_mc_addr,
+ >rap_context.rap_shared_buf);
+
+   return ret;
+}
+
+static int psp_rap_load(struct psp_context *psp)
+{
+   int ret;
+   struct psp_gfx_cmd_resp *cmd;
+
+   cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
+   if (!cmd)
+   return -ENOMEM;
+
+   memset(psp->fw_pri_buf, 0, PSP_1_MEG);
+   memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size);
+
+   psp_prep_ta_load_cmd_buf(cmd,
+psp->fw_pri_mc_addr,
+psp->ta_rap_ucode_size,
+psp->rap_context.rap_shared_mc_addr,
+PSP_RAP_SHARED_MEM_SIZE);
+
+   ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr);
+
+   if (!ret) {
+   psp->rap_context.rap_initialized = true;
+   psp->rap_context.session_id = cmd->resp.session_id;
+   mutex_init(>rap_context.mutex);
+   }
+
+   kfree(cmd);
+
+   return ret;
+}
+
+static int psp_rap_unload(struct psp_context *psp)
+{
+   int ret;
+   struct psp_gfx_cmd_resp *cmd;
+
+   cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
+   if (!cmd)
+   return -ENOMEM;
+
+   psp_prep_ta_unload_cmd_buf(cmd, psp->rap_context.session_id);
+
+   ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr);
+
+   kfree(cmd);
+
+   return ret;
+}
+
+static int psp_rap_initialize(struct psp_context *psp)
+{
+   int ret;
+
+   /*
+* TODO: bypass the initialize in sriov for now
+*/
+   if (amdgpu_sriov_vf(psp->adev))
+   return 0;
+
+   if (!psp->adev->psp.ta_rap_ucode_size ||
+   !psp->adev->psp.ta_rap_start_addr) {
+   dev_info(psp->adev->dev, "RAP: optional rap ta ucode is not 
available\n");
+   return 0;
+   }
+
+   if (!psp->rap_context.rap_initialized) {
+   ret = psp_rap_init_shared_buf(psp);
+   if (ret)
+   return ret;
+   }
+
+   ret = psp_rap_load(psp);
+   if (ret)
+   return ret;
+
+   ret = psp_rap_invoke(psp, TA_CMD_RAP__INITIALIZE);
+   if (ret != TA_RAP_STATUS__SUCCESS) {
+   psp_rap_unload(psp);
+
+   amdgpu_bo_free_kernel(>rap_context.rap_shared_bo,
+ >rap_context.rap_shared_mc_addr,
+ >rap_context.rap_shared_buf);
+
+   psp->rap_context.rap_initialized = false;
+
+   dev_warn(psp->adev->dev, "RAP TA initialize fail.\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int psp_rap_terminate(struct psp_context *psp)
+{
+   int ret;
+
+   if (!psp->rap_context.rap_initialized)
+   return 0;
+
+   ret = psp_rap_unload(psp);
+
+   psp->rap_context.rap_initialized = false;
+
+   /* free rap shared memory */
+   amdgpu_bo_free_kernel(>rap_context.rap_shared_bo,
+ >rap_context.rap_shared_mc_addr,
+ >rap_context.rap_shared_buf);
+
+   return ret;
+}
+
+int psp_rap_invoke(struct psp_context *psp, uint32_t ta_cmd_id)
+{
+   struct ta_rap_shared_memory *rap_cmd;
+   int ret;
+
+   if (!psp->rap_context.rap_initialized)
+   return -EINVAL;
+
+   if (ta_cmd_id != TA_CMD_RAP__INITIALIZE &&
+   ta_cmd_id != TA_CMD_RAP__VALIDATE_L0)
+   return -EINVAL;
+
+   mutex_lock(>rap_context.mutex);
+
+   rap_cmd = (struct ta_rap_shared_memory *)
+ psp->rap_context.rap_shared_buf;
+   memset(rap_cmd, 0, sizeof(struct ta_rap_shared_memory));
+
+   rap_cmd->cmd_id = ta_cmd_id;

[PATCH v3] drm/amdgpu: annotate a false positive recursive locking

2020-08-10 Thread Dennis Li
[  584.110304] 
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G   OE
[  584.64] 
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] 9b15ff0a47a0 (>reset_sem){}, at: 
amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
   but task is already holding lock:
[  584.112673] 9b1603d247a0 (>reset_sem){}, at: 
amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
   other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]CPU0
[  584.114685]
[  584.115014]   lock(>reset_sem);
[  584.115349]   lock(>reset_sem);
[  584.115678]
*** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: 9ad635c1d348 ((wq_completion)events){+.+.}, at: 
process_one_work+0x21f/0x630
[  584.117967]  #1: ac708e1c3e58 
((work_completion)(>recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: c1c2a5d0 (>hive_lock){+.+.}, at: 
amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: 9b1603d247a0 (>reset_sem){}, at: 
amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.119222]
   stack backtrace:
[  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G 
  OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 
05/23/2019
[  584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  584.121638] Call Trace:
[  584.122050]  dump_stack+0x98/0xd5
[  584.122499]  __lock_acquire+0x1139/0x16e0
[  584.122931]  ? trace_hardirqs_on+0x3b/0xf0
[  584.123358]  ? cancel_delayed_work+0xa6/0xc0
[  584.123771]  lock_acquire+0xb8/0x1c0
[  584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.124599]  down_write+0x49/0x120
[  584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[  584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[  584.126789]  process_one_work+0x29e/0x630
[  584.127208]  worker_thread+0x3c/0x3f0
[  584.127621]  ? __kthread_parkme+0x61/0x90
[  584.128014]  kthread+0x12f/0x150
[  584.128402]  ? process_one_work+0x630/0x630
[  584.128790]  ? kthread_park+0x90/0x90
[  584.129174]  ret_from_fork+0x3a/0x50

Each adev has owned lock_class_key to avoid false positive
recursive locking.

v2:
1. register adev->lock_key into lockdep, otherwise lockdep will
report the below warning

[ 1216.705820] BUG: key 890183b647d0 has not been registered!
[ 1216.705924] [ cut here ]
[ 1216.705972] DEBUG_LOCKS_WARN_ON(1)
[ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 
lockdep_init_map+0x150/0x210

v3:
change to use down_write_nest_lock to annotate the false dead-lock
warning.

Signed-off-by: Dennis Li 

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac97fbd2..8a55b0bc044a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4145,12 +4145,15 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
*hive,
return r;
 }
 
-static bool amdgpu_device_lock_adev(struct amdgpu_device *adev)
+static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, struct 
amdgpu_hive_info *hive)
 {
if (atomic_cmpxchg(>in_gpu_reset, 0, 1) != 0)
return false;
 
-   down_write(>reset_sem);
+   if (hive) {
+   down_write_nest_lock(>reset_sem, >hive_lock);
+   } else
+   down_write(>reset_sem);
 
atomic_inc(>gpu_reset_counter);
switch (amdgpu_asic_reset_method(adev)) {
@@ -4312,7 +4315,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
/* block all schedulers and reset given job's ring */
list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
-   if (!amdgpu_device_lock_adev(tmp_adev)) {
+   if (!amdgpu_device_lock_adev(tmp_adev, hive)) {
DRM_INFO("Bailing on TDR for s_job:%llx, as another 
already in progress",
  job ? job->base.id : -1);
r = 0;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory

2020-08-10 Thread Alex Deucher
Applied and updated the commit message to reflect the sizes.

Thanks!

Alex

On Mon, Aug 10, 2020 at 3:07 PM Marion & Christophe JAILLET
 wrote:
>
>
> Le 10/08/2020 à 17:42, Dan Carpenter a écrit :
> > On Sun, Aug 09, 2020 at 10:34:06PM +0200, Christophe JAILLET wrote:
> >> When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
> >> of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than
> >> 'sgt' (i.e struct sg_table), so this could lead to memory corruption.
> > The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
> > it won't lead to corruption.
> >
> >  11  struct scatterlist {
> >  12  unsigned long   page_link;
> >  13  unsigned intoffset;
> >  14  unsigned intlength;
> >  15  dma_addr_t  dma_address;
> >  16  #ifdef CONFIG_NEED_SG_DMA_LENGTH
> >  17  unsigned intdma_length;
> >  18  #endif
> >  19  };
> >
> >  42  struct sg_table {
> >  43  struct scatterlist *sgl;/* the list */
> >  44  unsigned int nents; /* number of mapped 
> > entries */
> >  45  unsigned int orig_nents;/* original size of list */
> >  46  };
> >
> > regards,
> > dan carpenter
>
>
> My bad. I read 'struct scatterlist sgl' (without the *)
> Thanks for the follow-up, Dan.
>
> Doesn't smatch catch such mismatch?
> (I've not run smatch for a while, so it is maybe reported)
>
> Well, the proposal is still valid, even if it has less impact as
> initially thought.
>
> Thx for the review.
>
> CJ
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: convert to use le16_add_cpu()

2020-08-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Mon, Aug 10, 2020 at 9:05 AM Qinglang Miao  wrote:
>
> Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().
>
> Signed-off-by: Qinglang Miao 
> ---
>  drivers/gpu/drm/amd/display/dc/bios/command_table.c  | 4 +---
>  drivers/gpu/drm/amd/display/dc/bios/command_table2.c | 5 +
>  2 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c 
> b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> index 5815983ca..070459e3e 100644
> --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
> @@ -1877,9 +1877,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3(
>  * but it is 4 either from Edid data (spec CEA 861)
>  * or CEA timing table.
>  */
> -   params.usV_SyncOffset =
> -   
> cpu_to_le16(le16_to_cpu(params.usV_SyncOffset) + 1);
> -
> +   le16_add_cpu(_SyncOffset, 1);
> }
> }
>
> diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c 
> b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
> index bed91572f..e8f52eb8e 100644
> --- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
> +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
> @@ -569,10 +569,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3(
>  * but it is 4 either from Edid data (spec CEA 861)
>  * or CEA timing table.
>  */
> -   params.v_syncoffset =
> -   cpu_to_le16(le16_to_cpu(params.v_syncoffset) +
> -   1);
> -
> +   le16_add_cpu(_syncoffset, 1);
> }
> }
>
> --
> 2.25.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Non-deterministically boot into dark screen with `amdgpu`

2020-08-10 Thread Alex Deucher
On Mon, Aug 10, 2020 at 7:46 AM Christian König
 wrote:
>
> Hi guys,
>
> Am 10.08.20 um 08:43 schrieb Alexander Monakov:
>
> Hi,
>
> you should Сс a specialized mailing list and a relevant maintainer,
> otherwise your email is likely to be ignored as LKML is an incredibly
> high-volume list. Adding amd-gfx and Alex Deucher.
>
>
> Thanks for forwarding this. AFAIK we haven't heard of this bug before, but 
> Alex already might know more about it.
>
> More thoughts below.
>
> On Sun, 9 Aug 2020, Ignat Insarov wrote:
>
> Hello!
>
> This is an issue report. I am not familiar with the Linux kernel
> development procedure, so please direct me to a more appropriate or
> specialized medium if this is not the right avenue.
>
> My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than
> not. Screen blackness correlates with a line in the `systemd` journal
> that says `RAM width Nbits DDR4`, where N is either 128 (resulting in
> dark screen) or 64 (resulting in a healthy boot). The number seems to
> be chosen at random with bias towards 128. This has been going on for
> a while so here is some statistics:
>
> * 356 boots proceed far enough to  attempt mode setting.
> * 82 boots set RAM width to 64 bits and presumably succeed.
> * 274 boots set RAM width to 128 bits and presumably fail.
>
> The issue is prevented with the `nomodeset` kernel option.
>
> I reported this previously (about a year ago) on the forum of my Linux
> distribution.[1] The issue still persists as of  linux 5.8.0.
>
> The details of my graphics controller, as well as some journal
> excerpts, can be seen at [1]. One thing that has changed since then is
> that on failure, there now appears a null pointer dereference error. I
> am attaching the log of kernel messages from the most recent failed
> boot — please request more information if needed.
>
> I appreciate any directions and advice as to how I may go about fixing
> this annoyance.
>
> [1]: https://bbs.archlinux.org/viewtopic.php?id=248273
>
> On the forum you show that in the "success" case there's one less "BIOS
> signature incorrect" message. This implies that amdgpu_get_bios() in
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> gets the video BIOS from a different source. If that happens every time
> (one "signature incorrect" message for "success", two for "failure")
> that may be relevant to the problem you're experiencing.
>
> If you don't mind patching and rebuilding the kernel I suggest adding
> debug printks to the aforementioned function to see exactly which methods
> fail with wrong signature and which succeeds.
>
> Also might be worthwhile to check if there's a BIOS update for your laptop.
>
>
> It might also be a good idea to try the latest amd-staging-drm-next branch 
> from Alex repository (bear with me I don't have the link at hand, but it 
> should be easy to find).
>
> Opening a bug report or searching the existing ones for something similar 
> under https://gitlab.freedesktop.org/drm/amd/-/issues might be a good idea as 
> well.
>
> And I completely agree that this sounds like an issue getting the BIOS image.

I've not heard of an issue like this either.  Best to file a gitlab
bug and attach your full dmesg output in both the working and
non-working cases and we can go from there.

Alex

>
> Thanks,
> Christian.
>
>
> Alexander
>
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 4.9 04/17] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 432ad7d73cb9b..99e23800cadc7 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -639,8 +639,10 @@ radeon_crtc_set_config(struct drm_mode_set *set)
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 30bd4a6a9d466..7648fd0d10751 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -496,8 +496,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index 4388ddeec8d24..96d2a564d9a3c 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -634,8 +634,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 4.4 04/16] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 4572bfba017c5..17c73b8c90e71 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -660,8 +660,10 @@ radeon_crtc_set_config(struct drm_mode_set *set)
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 5b6a6f5b3619e..401403a3ea50c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -527,8 +527,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index d290a8a09036e..41caf7da90548 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -631,8 +631,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 4.14 04/22] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index d86110cdf0852..b2334349799d1 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -627,8 +627,10 @@ radeon_crtc_set_config(struct drm_mode_set *set,
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set, ctx);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index f6908e2f9e55a..41e8abd099784 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -496,8 +496,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index dfee8f7d94ae5..2e28cf8118404 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -659,8 +659,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 4.19 16/31] drm/radeon: disable AGP by default

2020-08-10 Thread Sasha Levin
From: Christian König 

[ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ]

Always use the PCI GART instead. We just have to many cases
where AGP still causes problems. This means a performance
regression for some GPUs, but also a bug fix for some others.

Signed-off-by: Christian König 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_drv.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 54729acd0d4af..0cd33289c2b63 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -168,12 +168,7 @@ int radeon_no_wb;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
 int radeon_r4xx_atom = 0;
-#ifdef __powerpc__
-/* Default to PCI on PowerPC (fdo #95017) */
 int radeon_agpmode = -1;
-#else
-int radeon_agpmode = 0;
-#endif
 int radeon_vram_limit = 0;
 int radeon_gart_size = -1; /* auto */
 int radeon_benchmarking = 0;
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 4.19 06/31] drm/amdgpu: avoid dereferencing a NULL pointer

2020-08-10 Thread Sasha Levin
From: Jack Xiao 

[ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ]

Check if irq_src is NULL to avoid dereferencing a NULL pointer,
for MES ring is uneccessary to recieve an interrupt notification.

Signed-off-by: Jack Xiao 
Acked-by: Alex Deucher 
Reviewed-by: Hawking Zhang 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 869ff624b108c..e5e51e4d4f3d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -396,7 +396,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + 
index;
}
amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq));
-   amdgpu_irq_get(adev, irq_src, irq_type);
+
+   if (irq_src)
+   amdgpu_irq_get(adev, irq_src, irq_type);
 
ring->fence_drv.irq_src = irq_src;
ring->fence_drv.irq_type = irq_type;
@@ -508,8 +510,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
/* no need to trigger GPU reset as we are unloading */
amdgpu_fence_driver_force_completion(ring);
}
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
drm_sched_fini(>sched);
del_timer_sync(>fence_drv.fallback_timer);
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
@@ -545,8 +548,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev)
}
 
/* disable the interrupt */
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
@@ -572,8 +576,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev)
continue;
 
/* enable the interrupt */
-   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.4 28/45] drm/amd/powerplay: fix compile error with ARCH=arc

2020-08-10 Thread Sasha Levin
From: Evan Quan 

[ Upstream commit 9822ba2ead1baa3de4860ad9472f652c4cc78c9c ]

Fix the compile error below:
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 
'smu_v11_0_init_microcode':
>> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function 
>> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration]
  22 |  pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, 
__func__); \
 |  ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:176:3: note: in expansion 
of macro 'BUG'
 176 |   BUG();

Reported-by: kernel test robot 
Signed-off-by: Evan Quan 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index 0922d9cd858a0..c4d8c52c6b9ca 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -171,7 +171,8 @@ static int smu_v11_0_init_microcode(struct smu_context *smu)
chip_name = "navi12";
break;
default:
-   BUG();
+   dev_err(adev->dev, "Unsupported ASIC type %d\n", 
adev->asic_type);
+   return -EINVAL;
}
 
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name);
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 4.19 07/31] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 7d1e14f0140a2..3f0f3a578ddf0 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -625,8 +625,10 @@ radeon_crtc_set_config(struct drm_mode_set *set,
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set, ctx);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index c26f09b47ecb2..54729acd0d4af 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -523,8 +523,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index 3ff835767ac58..34b3cb6c146f9 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -627,8 +627,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.4 27/45] drm/amdgpu/display bail early in dm_pp_get_static_clocks

2020-08-10 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 376814f5fcf1aadda501d1413d56e8af85d19a97 ]

If there are no supported callbacks.  We'll fall back to the
nominal clocks.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index 785322cd4c6c9..7241d4c207789 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -530,6 +530,8 @@ bool dm_pp_get_static_clocks(
_clk_info);
else if (adev->smu.funcs)
ret = smu_get_current_clocks(>smu, _clk_info);
+   else
+   return false;
if (ret)
return false;
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.4 24/45] drm/radeon: disable AGP by default

2020-08-10 Thread Sasha Levin
From: Christian König 

[ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ]

Always use the PCI GART instead. We just have to many cases
where AGP still causes problems. This means a performance
regression for some GPUs, but also a bug fix for some others.

Signed-off-by: Christian König 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_drv.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 7d417b9a52501..c2573096d43c0 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -174,12 +174,7 @@ int radeon_no_wb;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
 int radeon_r4xx_atom = 0;
-#ifdef __powerpc__
-/* Default to PCI on PowerPC (fdo #95017) */
 int radeon_agpmode = -1;
-#else
-int radeon_agpmode = 0;
-#endif
 int radeon_vram_limit = 0;
 int radeon_gart_size = -1; /* auto */
 int radeon_benchmarking = 0;
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.4 07/45] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 0826efd9b5f51..f9f74150d0d73 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -631,8 +631,10 @@ radeon_crtc_set_config(struct drm_mode_set *set,
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set, ctx);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 6128792ab8836..7d417b9a52501 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -555,8 +555,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index 2bb0187c5bc78..709c4ef5e7d59 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 44/60] drm/amd/powerplay: suppress compile error around BUG_ON

2020-08-10 Thread Sasha Levin
From: Evan Quan 

[ Upstream commit 75bc07e2403caea9ecac69f766dfb7dc33547594 ]

To suppress the compile error below for "ARCH=arc".
   drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c: In function 
'arcturus_fill_eeprom_i2c_req':
>> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function 
>> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration]
  22 |  pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, 
__func__); \
 |  ^~~
   include/asm-generic/bug.h:62:57: note: in expansion of macro 'BUG'
  62 | #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } 
while (0)
 | ^~~
   drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c:2157:2: note: in 
expansion of macro 'BUG_ON'
2157 |  BUG_ON(numbytes > MAX_SW_I2C_COMMANDS);

Signed-off-by: Evan Quan 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c 
b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
index 1ef0923f71906..9ad0e6f18be49 100644
--- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
@@ -2035,8 +2035,6 @@ static void arcturus_fill_eeprom_i2c_req(SwI2cRequest_t  
*req, bool write,
 {
int i;
 
-   BUG_ON(numbytes > MAX_SW_I2C_COMMANDS);
-
req->I2CcontrollerPort = 0;
req->I2CSpeed = 2;
req->SlaveAddress = address;
@@ -2074,6 +2072,12 @@ static int arcturus_i2c_eeprom_read_data(struct 
i2c_adapter *control,
struct smu_table_context *smu_table = >smu.smu_table;
struct smu_table *table = _table->driver_table;
 
+   if (numbytes > MAX_SW_I2C_COMMANDS) {
+   dev_err(adev->dev, "numbytes requested %d is over max allowed 
%d\n",
+   numbytes, MAX_SW_I2C_COMMANDS);
+   return -EINVAL;
+   }
+
memset(, 0, sizeof(req));
arcturus_fill_eeprom_i2c_req(, false, address, numbytes, data);
 
@@ -2110,6 +2114,12 @@ static int arcturus_i2c_eeprom_write_data(struct 
i2c_adapter *control,
SwI2cRequest_t req;
struct amdgpu_device *adev = to_amdgpu_device(control);
 
+   if (numbytes > MAX_SW_I2C_COMMANDS) {
+   dev_err(adev->dev, "numbytes requested %d is over max allowed 
%d\n",
+   numbytes, MAX_SW_I2C_COMMANDS);
+   return -EINVAL;
+   }
+
memset(, 0, sizeof(req));
arcturus_fill_eeprom_i2c_req(, true, address, numbytes, data);
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.4 06/45] drm/amdgpu: avoid dereferencing a NULL pointer

2020-08-10 Thread Sasha Levin
From: Jack Xiao 

[ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ]

Check if irq_src is NULL to avoid dereferencing a NULL pointer,
for MES ring is uneccessary to recieve an interrupt notification.

Signed-off-by: Jack Xiao 
Acked-by: Alex Deucher 
Reviewed-by: Hawking Zhang 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 23085b352cf2d..c212d5fc665c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -404,7 +404,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + 
index;
}
amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq));
-   amdgpu_irq_get(adev, irq_src, irq_type);
+
+   if (irq_src)
+   amdgpu_irq_get(adev, irq_src, irq_type);
 
ring->fence_drv.irq_src = irq_src;
ring->fence_drv.irq_type = irq_type;
@@ -539,8 +541,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
/* no need to trigger GPU reset as we are unloading */
amdgpu_fence_driver_force_completion(ring);
}
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
drm_sched_fini(>sched);
del_timer_sync(>fence_drv.fallback_timer);
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
@@ -576,8 +579,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev)
}
 
/* disable the interrupt */
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
@@ -603,8 +607,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev)
continue;
 
/* enable the interrupt */
-   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 35/60] drm/amd/powerplay: fix compile error with ARCH=arc

2020-08-10 Thread Sasha Levin
From: Evan Quan 

[ Upstream commit 9822ba2ead1baa3de4860ad9472f652c4cc78c9c ]

Fix the compile error below:
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 
'smu_v11_0_init_microcode':
>> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function 
>> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration]
  22 |  pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, 
__func__); \
 |  ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:176:3: note: in expansion 
of macro 'BUG'
 176 |   BUG();

Reported-by: kernel test robot 
Signed-off-by: Evan Quan 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index 655ba4fb05dcd..48af305d42d54 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -159,7 +159,8 @@ int smu_v11_0_init_microcode(struct smu_context *smu)
chip_name = "navi12";
break;
default:
-   BUG();
+   dev_err(adev->dev, "Unsupported ASIC type %d\n", 
adev->asic_type);
+   return -EINVAL;
}
 
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name);
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 28/60] drm/radeon: disable AGP by default

2020-08-10 Thread Sasha Levin
From: Christian König 

[ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ]

Always use the PCI GART instead. We just have to many cases
where AGP still causes problems. This means a performance
regression for some GPUs, but also a bug fix for some others.

Signed-off-by: Christian König 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_drv.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 0cf0b00a0623e..6f0d1971099b7 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -171,12 +171,7 @@ int radeon_no_wb;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
 int radeon_r4xx_atom = 0;
-#ifdef __powerpc__
-/* Default to PCI on PowerPC (fdo #95017) */
 int radeon_agpmode = -1;
-#else
-int radeon_agpmode = 0;
-#endif
 int radeon_vram_limit = 0;
 int radeon_gart_size = -1; /* auto */
 int radeon_benchmarking = 0;
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 09/60] drm/amdgpu: avoid dereferencing a NULL pointer

2020-08-10 Thread Sasha Levin
From: Jack Xiao 

[ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ]

Check if irq_src is NULL to avoid dereferencing a NULL pointer,
for MES ring is uneccessary to recieve an interrupt notification.

Signed-off-by: Jack Xiao 
Acked-by: Alex Deucher 
Reviewed-by: Hawking Zhang 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 7531527067dfb..892c1e9a1eb04 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -408,7 +408,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + 
index;
}
amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq));
-   amdgpu_irq_get(adev, irq_src, irq_type);
+
+   if (irq_src)
+   amdgpu_irq_get(adev, irq_src, irq_type);
 
ring->fence_drv.irq_src = irq_src;
ring->fence_drv.irq_type = irq_type;
@@ -529,8 +531,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
/* no need to trigger GPU reset as we are unloading */
amdgpu_fence_driver_force_completion(ring);
}
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
drm_sched_fini(>sched);
del_timer_sync(>fence_drv.fallback_timer);
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
@@ -566,8 +569,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev)
}
 
/* disable the interrupt */
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
@@ -593,8 +597,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev)
continue;
 
/* enable the interrupt */
-   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 34/60] drm/amdgpu/display: properly guard the calls to swSMU functions

2020-08-10 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 4072327a2622af8688b88f5cd0a472136d3bf33d ]

It's only applicable on newer asics.  We could end up here when
using DC on older asics like SI or KV.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index 7cee8070cb113..5c6a6ae48d396 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -106,7 +106,7 @@ bool dm_pp_apply_display_requirements(
adev->powerplay.pp_funcs->display_configuration_change(
adev->powerplay.pp_handle,
>pm.pm_display_cfg);
-   else
+   else if (adev->smu.ppt_funcs)
smu_display_configuration_change(smu,
 
>pm.pm_display_cfg);
 
@@ -592,7 +592,7 @@ void pp_rv_set_wm_ranges(struct pp_smu *pp,
if (pp_funcs && pp_funcs->set_watermarks_for_clocks_ranges)
pp_funcs->set_watermarks_for_clocks_ranges(pp_handle,
   
_with_clock_ranges);
-   else
+   else if (adev->smu.ppt_funcs)
smu_set_watermarks_for_clock_ranges(>smu,
_with_clock_ranges);
 }
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 32/60] drm/amd/display: Improve DisplayPort monitor interop

2020-08-10 Thread Sasha Levin
From: Aric Cyr 

[ Upstream commit eec3303de3378cdfaa0bb86f43546dbbd88f94e2 ]

[Why]
DC is very fast at link training and stream enablement
which causes issues such as blackscreens for non-compliant
monitors.

[How]
After debugging with scaler vendors we implement the
minimum delays at the necessary locations to ensure
the monitor does not hang.  Delays are generic due to
lack of IEEE OUI information on the failing displays.

Signed-off-by: Aric Cyr 
Reviewed-by: Wenjing Liu 
Acked-by: Qingqing Zhuo 
Acked-by: Tony Cheng 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c|  4 +++-
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 16 ++--
 .../amd/display/dc/dce110/dce110_hw_sequencer.c  | 11 ++-
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 67cfff1586e9f..3f157bcc174b9 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3146,9 +3146,11 @@ void core_link_disable_stream(struct pipe_ctx *pipe_ctx)
write_i2c_redriver_setting(pipe_ctx, false);
}
}
-   dc->hwss.disable_stream(pipe_ctx);
 
disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal);
+
+   dc->hwss.disable_stream(pipe_ctx);
+
if (pipe_ctx->stream->timing.flags.DSC) {
if (dc_is_dp_signal(pipe_ctx->stream->signal))
dp_set_dsc_enable(pipe_ctx, false);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index caa090d0b6acc..1ada01322cd2c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -1103,6 +1103,10 @@ static inline enum link_training_result 
perform_link_training_int(
dpcd_pattern.v1_4.TRAINING_PATTERN_SET = 
DPCD_TRAINING_PATTERN_VIDEOIDLE;
dpcd_set_training_pattern(link, dpcd_pattern);
 
+   /* delay 5ms after notifying sink of idle pattern before switching 
output */
+   if (link->connector_signal != SIGNAL_TYPE_EDP)
+   msleep(5);
+
/* 4. mainlink output idle pattern*/
dp_set_hw_test_pattern(link, DP_TEST_PATTERN_VIDEO_MODE, NULL, 0);
 
@@ -1552,6 +1556,12 @@ bool perform_link_training_with_retries(
struct dc_link *link = stream->link;
enum dp_panel_mode panel_mode = dp_get_panel_mode(link);
 
+   /* We need to do this before the link training to ensure the idle 
pattern in SST
+* mode will be sent right after the link training
+*/
+   link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc,
+   
pipe_ctx->stream_res.stream_enc->id, true);
+
for (j = 0; j < attempts; ++j) {
 
dp_enable_link_phy(
@@ -1568,12 +1578,6 @@ bool perform_link_training_with_retries(
 
dp_set_panel_mode(link, panel_mode);
 
-   /* We need to do this before the link training to ensure the 
idle pattern in SST
-* mode will be sent right after the link training
-*/
-   link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc,
-   
pipe_ctx->stream_res.stream_enc->id, true);
-
if (link->aux_access_disabled) {
dc_link_dp_perform_link_training_skip_aux(link, 
link_setting);
return true;
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 10527593868cc..24ca592c90df5 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1090,8 +1090,17 @@ void dce110_blank_stream(struct pipe_ctx *pipe_ctx)
dc_link_set_abm_disable(link);
}
 
-   if (dc_is_dp_signal(pipe_ctx->stream->signal))
+   if (dc_is_dp_signal(pipe_ctx->stream->signal)) {

pipe_ctx->stream_res.stream_enc->funcs->dp_blank(pipe_ctx->stream_res.stream_enc);
+
+   /*
+* After output is idle pattern some sinks need time to 
recognize the stream
+* has changed or they enter protection state and hang.
+*/
+   if (!dc_is_embedded_signal(pipe_ctx->stream->signal))
+   msleep(60);
+   }
+
 }
 
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 33/60] drm/amdgpu/display bail early in dm_pp_get_static_clocks

2020-08-10 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 376814f5fcf1aadda501d1413d56e8af85d19a97 ]

If there are no supported callbacks.  We'll fall back to the
nominal clocks.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index a2e1a73f66b81..7cee8070cb113 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -530,6 +530,8 @@ bool dm_pp_get_static_clocks(
_clk_info);
else if (adev->smu.ppt_funcs)
ret = smu_get_current_clocks(>smu, _clk_info);
+   else
+   return false;
if (ret)
return false;
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.7 10/60] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 35db79a168bf7..df1a7eb736517 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -635,8 +635,10 @@ radeon_crtc_set_config(struct drm_mode_set *set,
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set, ctx);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 59f8186a24151..0cf0b00a0623e 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -549,8 +549,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index 58176db85952c..779e4cd86245d 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 37/64] drm/amdgpu/display: properly guard the calls to swSMU functions

2020-08-10 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 4072327a2622af8688b88f5cd0a472136d3bf33d ]

It's only applicable on newer asics.  We could end up here when
using DC on older asics like SI or KV.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index 7cee8070cb113..5c6a6ae48d396 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -106,7 +106,7 @@ bool dm_pp_apply_display_requirements(
adev->powerplay.pp_funcs->display_configuration_change(
adev->powerplay.pp_handle,
>pm.pm_display_cfg);
-   else
+   else if (adev->smu.ppt_funcs)
smu_display_configuration_change(smu,
 
>pm.pm_display_cfg);
 
@@ -592,7 +592,7 @@ void pp_rv_set_wm_ranges(struct pp_smu *pp,
if (pp_funcs && pp_funcs->set_watermarks_for_clocks_ranges)
pp_funcs->set_watermarks_for_clocks_ranges(pp_handle,
   
_with_clock_ranges);
-   else
+   else if (adev->smu.ppt_funcs)
smu_set_watermarks_for_clock_ranges(>smu,
_with_clock_ranges);
 }
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 36/64] drm/amdgpu/display bail early in dm_pp_get_static_clocks

2020-08-10 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 376814f5fcf1aadda501d1413d56e8af85d19a97 ]

If there are no supported callbacks.  We'll fall back to the
nominal clocks.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170
Reviewed-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index a2e1a73f66b81..7cee8070cb113 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -530,6 +530,8 @@ bool dm_pp_get_static_clocks(
_clk_info);
else if (adev->smu.ppt_funcs)
ret = smu_get_current_clocks(>smu, _clk_info);
+   else
+   return false;
if (ret)
return false;
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 34/64] drm/amd/display: Improve DisplayPort monitor interop

2020-08-10 Thread Sasha Levin
From: Aric Cyr 

[ Upstream commit eec3303de3378cdfaa0bb86f43546dbbd88f94e2 ]

[Why]
DC is very fast at link training and stream enablement
which causes issues such as blackscreens for non-compliant
monitors.

[How]
After debugging with scaler vendors we implement the
minimum delays at the necessary locations to ensure
the monitor does not hang.  Delays are generic due to
lack of IEEE OUI information on the failing displays.

Signed-off-by: Aric Cyr 
Reviewed-by: Wenjing Liu 
Acked-by: Qingqing Zhuo 
Acked-by: Tony Cheng 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c|  4 +++-
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 16 ++--
 .../amd/display/dc/dce110/dce110_hw_sequencer.c  | 11 ++-
 3 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 48ab51533d5d6..841cc051b7d01 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3298,9 +3298,11 @@ void core_link_disable_stream(struct pipe_ctx *pipe_ctx)
write_i2c_redriver_setting(pipe_ctx, false);
}
}
-   dc->hwss.disable_stream(pipe_ctx);
 
disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal);
+
+   dc->hwss.disable_stream(pipe_ctx);
+
if (pipe_ctx->stream->timing.flags.DSC) {
if (dc_is_dp_signal(pipe_ctx->stream->signal))
dp_set_dsc_enable(pipe_ctx, false);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 91cd884d6f257..6124af571bff6 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -1102,6 +1102,10 @@ static inline enum link_training_result 
perform_link_training_int(
dpcd_pattern.v1_4.TRAINING_PATTERN_SET = 
DPCD_TRAINING_PATTERN_VIDEOIDLE;
dpcd_set_training_pattern(link, dpcd_pattern);
 
+   /* delay 5ms after notifying sink of idle pattern before switching 
output */
+   if (link->connector_signal != SIGNAL_TYPE_EDP)
+   msleep(5);
+
/* 4. mainlink output idle pattern*/
dp_set_hw_test_pattern(link, DP_TEST_PATTERN_VIDEO_MODE, NULL, 0);
 
@@ -1551,6 +1555,12 @@ bool perform_link_training_with_retries(
struct dc_link *link = stream->link;
enum dp_panel_mode panel_mode = dp_get_panel_mode(link);
 
+   /* We need to do this before the link training to ensure the idle 
pattern in SST
+* mode will be sent right after the link training
+*/
+   link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc,
+   
pipe_ctx->stream_res.stream_enc->id, true);
+
for (j = 0; j < attempts; ++j) {
 
dp_enable_link_phy(
@@ -1567,12 +1577,6 @@ bool perform_link_training_with_retries(
 
dp_set_panel_mode(link, panel_mode);
 
-   /* We need to do this before the link training to ensure the 
idle pattern in SST
-* mode will be sent right after the link training
-*/
-   link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc,
-   
pipe_ctx->stream_res.stream_enc->id, true);
-
if (link->aux_access_disabled) {
dc_link_dp_perform_link_training_skip_aux(link, 
link_setting);
return true;
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index b77e9dc160863..2af1d74d16ad8 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1069,8 +1069,17 @@ void dce110_blank_stream(struct pipe_ctx *pipe_ctx)
link->dc->hwss.set_abm_immediate_disable(pipe_ctx);
}
 
-   if (dc_is_dp_signal(pipe_ctx->stream->signal))
+   if (dc_is_dp_signal(pipe_ctx->stream->signal)) {

pipe_ctx->stream_res.stream_enc->funcs->dp_blank(pipe_ctx->stream_res.stream_enc);
+
+   /*
+* After output is idle pattern some sinks need time to 
recognize the stream
+* has changed or they enter protection state and hang.
+*/
+   if (!dc_is_embedded_signal(pipe_ctx->stream->signal))
+   msleep(60);
+   }
+
 }
 
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 38/64] drm/amd/display: allow query ddc data over aux to be read only operation

2020-08-10 Thread Sasha Levin
From: Wenjing Liu 

[ Upstream commit 26b4750d6cf84cb2b3f0a84c9b345e7b71886410 ]

[why]
Two issues:
1. Add read only operation support for query ddc data over aux.
2. Fix a bug where if read size is multiple of 16,
mot of the last read transaction will not be set to 0.

Signed-off-by: Wenjing Liu 
Reviewed-by: Jun Lei 
Acked-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c | 29 ---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index aefd29a440b52..be8f265976b09 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -503,7 +503,7 @@ bool dal_ddc_service_query_ddc_data(
uint8_t *read_buf,
uint32_t read_size)
 {
-   bool ret = false;
+   bool success = true;
uint32_t payload_size =
dal_ddc_service_is_in_aux_transaction_mode(ddc) ?
DEFAULT_AUX_MAX_DATA_SIZE : EDID_SEGMENT_SIZE;
@@ -527,7 +527,6 @@ bool dal_ddc_service_query_ddc_data(
 *  but we want to read 256 over i2c*/
if (dal_ddc_service_is_in_aux_transaction_mode(ddc)) {
struct aux_payload payload;
-   bool read_available = true;
 
payload.i2c_over_aux = true;
payload.address = address;
@@ -536,21 +535,26 @@ bool dal_ddc_service_query_ddc_data(
 
if (write_size != 0) {
payload.write = true;
-   payload.mot = false;
+   /* should not set mot (middle of transaction) to 0
+* if there are pending read payloads
+*/
+   payload.mot = read_size == 0 ? false : true;
payload.length = write_size;
payload.data = write_buf;
 
-   ret = dal_ddc_submit_aux_command(ddc, );
-   read_available = ret;
+   success = dal_ddc_submit_aux_command(ddc, );
}
 
-   if (read_size != 0 && read_available) {
+   if (read_size != 0 && success) {
payload.write = false;
+   /* should set mot (middle of transaction) to 0
+* since it is the last payload to send
+*/
payload.mot = false;
payload.length = read_size;
payload.data = read_buf;
 
-   ret = dal_ddc_submit_aux_command(ddc, );
+   success = dal_ddc_submit_aux_command(ddc, );
}
} else {
struct i2c_command command = {0};
@@ -573,7 +577,7 @@ bool dal_ddc_service_query_ddc_data(
command.number_of_payloads =
dal_ddc_i2c_payloads_get_count();
 
-   ret = dm_helpers_submit_i2c(
+   success = dm_helpers_submit_i2c(
ddc->ctx,
ddc->link,
);
@@ -581,7 +585,7 @@ bool dal_ddc_service_query_ddc_data(
dal_ddc_i2c_payloads_destroy();
}
 
-   return ret;
+   return success;
 }
 
 bool dal_ddc_submit_aux_command(struct ddc_service *ddc,
@@ -598,7 +602,7 @@ bool dal_ddc_submit_aux_command(struct ddc_service *ddc,
 
do {
struct aux_payload current_payload;
-   bool is_end_of_payload = (retrieved + 
DEFAULT_AUX_MAX_DATA_SIZE) >
+   bool is_end_of_payload = (retrieved + 
DEFAULT_AUX_MAX_DATA_SIZE) >=
payload->length;
 
current_payload.address = payload->address;
@@ -607,7 +611,10 @@ bool dal_ddc_submit_aux_command(struct ddc_service *ddc,
current_payload.i2c_over_aux = payload->i2c_over_aux;
current_payload.length = is_end_of_payload ?
payload->length - retrieved : DEFAULT_AUX_MAX_DATA_SIZE;
-   current_payload.mot = !is_end_of_payload;
+   /* set mot (middle of transaction) to false
+* if it is the last payload
+*/
+   current_payload.mot = is_end_of_payload ? payload->mot:true;
current_payload.reply = payload->reply;
current_payload.write = payload->write;
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 48/64] drm/amd/powerplay: suppress compile error around BUG_ON

2020-08-10 Thread Sasha Levin
From: Evan Quan 

[ Upstream commit 75bc07e2403caea9ecac69f766dfb7dc33547594 ]

To suppress the compile error below for "ARCH=arc".
   drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c: In function 
'arcturus_fill_eeprom_i2c_req':
>> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function 
>> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration]
  22 |  pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, 
__func__); \
 |  ^~~
   include/asm-generic/bug.h:62:57: note: in expansion of macro 'BUG'
  62 | #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } 
while (0)
 | ^~~
   drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c:2157:2: note: in 
expansion of macro 'BUG_ON'
2157 |  BUG_ON(numbytes > MAX_SW_I2C_COMMANDS);

Signed-off-by: Evan Quan 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c 
b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
index 27c5fc9572b27..e4630a76d7bf5 100644
--- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c
@@ -2042,8 +2042,6 @@ static void arcturus_fill_eeprom_i2c_req(SwI2cRequest_t  
*req, bool write,
 {
int i;
 
-   BUG_ON(numbytes > MAX_SW_I2C_COMMANDS);
-
req->I2CcontrollerPort = 0;
req->I2CSpeed = 2;
req->SlaveAddress = address;
@@ -2081,6 +2079,12 @@ static int arcturus_i2c_eeprom_read_data(struct 
i2c_adapter *control,
struct smu_table_context *smu_table = >smu.smu_table;
struct smu_table *table = _table->driver_table;
 
+   if (numbytes > MAX_SW_I2C_COMMANDS) {
+   dev_err(adev->dev, "numbytes requested %d is over max allowed 
%d\n",
+   numbytes, MAX_SW_I2C_COMMANDS);
+   return -EINVAL;
+   }
+
memset(, 0, sizeof(req));
arcturus_fill_eeprom_i2c_req(, false, address, numbytes, data);
 
@@ -2117,6 +2121,12 @@ static int arcturus_i2c_eeprom_write_data(struct 
i2c_adapter *control,
SwI2cRequest_t req;
struct amdgpu_device *adev = to_amdgpu_device(control);
 
+   if (numbytes > MAX_SW_I2C_COMMANDS) {
+   dev_err(adev->dev, "numbytes requested %d is over max allowed 
%d\n",
+   numbytes, MAX_SW_I2C_COMMANDS);
+   return -EINVAL;
+   }
+
memset(, 0, sizeof(req));
arcturus_fill_eeprom_i2c_req(, true, address, numbytes, data);
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 39/64] drm/amd/powerplay: fix compile error with ARCH=arc

2020-08-10 Thread Sasha Levin
From: Evan Quan 

[ Upstream commit 9822ba2ead1baa3de4860ad9472f652c4cc78c9c ]

Fix the compile error below:
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 
'smu_v11_0_init_microcode':
>> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function 
>> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration]
  22 |  pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, 
__func__); \
 |  ^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:176:3: note: in expansion 
of macro 'BUG'
 176 |   BUG();

Reported-by: kernel test robot 
Signed-off-by: Evan Quan 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index aa76c2cea7471..7897be877b961 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -164,7 +164,8 @@ int smu_v11_0_init_microcode(struct smu_context *smu)
chip_name = "navi12";
break;
default:
-   BUG();
+   dev_err(adev->dev, "Unsupported ASIC type %d\n", 
adev->asic_type);
+   return -EINVAL;
}
 
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name);
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 35/64] drm/amdgpu/debugfs: fix ref count leak when pm_runtime_get_sync fails

2020-08-10 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 9eee152aab56d374edb9ad21b3db05f5cdda2fe6 ]

The call to pm_runtime_get_sync increments the counter even in case of
failure, leading to incorrect ref count.
In case of failure, decrement the ref count before returning.

Acked-by: Evan Quan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 95 +++--
 1 file changed, 70 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index a414da22a359c..386b979e08522 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -223,12 +223,16 @@ static int  amdgpu_debugfs_process_reg_op(bool read, 
struct file *f,
*pos &= (1UL << 22) - 1;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
if (use_bank) {
if ((sh_bank != 0x && sh_bank >= 
adev->gfx.config.max_sh_per_se) ||
@@ -332,12 +336,16 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file 
*f, char __user *buf,
return -EINVAL;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
while (size) {
uint32_t value;
@@ -387,12 +395,16 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file 
*f, const char __user
return -EINVAL;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
while (size) {
uint32_t value;
@@ -443,12 +455,16 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
return -EINVAL;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
while (size) {
uint32_t value;
@@ -498,12 +514,16 @@ static ssize_t amdgpu_debugfs_regs_didt_write(struct file 
*f, const char __user
return -EINVAL;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
while (size) {
uint32_t value;
@@ -554,12 +574,16 @@ static ssize_t amdgpu_debugfs_regs_smc_read(struct file 
*f, char __user *buf,
return -EINVAL;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
while (size) {
uint32_t value;
@@ -609,12 +633,16 @@ static ssize_t amdgpu_debugfs_regs_smc_write(struct file 
*f, const char __user *
return -EINVAL;
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
r = amdgpu_virt_enable_access_debugfs(adev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev->ddev->dev);
return r;
+   }
 
while (size) {
uint32_t value;
@@ -764,12 +792,16 @@ static ssize_t amdgpu_debugfs_sensor_read(struct file *f, 
char __user *buf,
valuesize = sizeof(values);
 
r = pm_runtime_get_sync(adev->ddev->dev);
-   if (r < 0)
+   if (r < 0) {
+ 

[PATCH AUTOSEL 5.8 10/64] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync

2020-08-10 Thread Sasha Levin
From: Aditya Pakki 

[ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ]

On calling pm_runtime_get_sync() the reference count of the device
is incremented. In case of failure, decrement the
reference count before returning the error.

Acked-by: Evan Quan 
Signed-off-by: Aditya Pakki 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_display.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_drv.c | 4 +++-
 drivers/gpu/drm/radeon/radeon_kms.c | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
b/drivers/gpu/drm/radeon/radeon_display.c
index 35db79a168bf7..df1a7eb736517 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -635,8 +635,10 @@ radeon_crtc_set_config(struct drm_mode_set *set,
dev = set->crtc->dev;
 
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_crtc_helper_set_config(set, ctx);
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index bbb0883e8ce6a..62b5069122cc4 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -549,8 +549,10 @@ long radeon_drm_ioctl(struct file *filp,
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
-   if (ret < 0)
+   if (ret < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return ret;
+   }
 
ret = drm_ioctl(filp, cmd, arg);

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index c5d1dc9618a40..99ee60f8b604d 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct 
drm_file *file_priv)
file_priv->driver_priv = NULL;
 
r = pm_runtime_get_sync(dev->dev);
-   if (r < 0)
+   if (r < 0) {
+   pm_runtime_put_autosuspend(dev->dev);
return r;
+   }
 
/* new gpu have virtual address space support */
if (rdev->family >= CHIP_CAYMAN) {
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 30/64] drm/radeon: disable AGP by default

2020-08-10 Thread Sasha Levin
From: Christian König 

[ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ]

Always use the PCI GART instead. We just have to many cases
where AGP still causes problems. This means a performance
regression for some GPUs, but also a bug fix for some others.

Signed-off-by: Christian König 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/radeon/radeon_drv.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 62b5069122cc4..4cd30613fa1dd 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -171,12 +171,7 @@ int radeon_no_wb;
 int radeon_modeset = -1;
 int radeon_dynclks = -1;
 int radeon_r4xx_atom = 0;
-#ifdef __powerpc__
-/* Default to PCI on PowerPC (fdo #95017) */
 int radeon_agpmode = -1;
-#else
-int radeon_agpmode = 0;
-#endif
 int radeon_vram_limit = 0;
 int radeon_gart_size = -1; /* auto */
 int radeon_benchmarking = 0;
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH AUTOSEL 5.8 09/64] drm/amdgpu: avoid dereferencing a NULL pointer

2020-08-10 Thread Sasha Levin
From: Jack Xiao 

[ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ]

Check if irq_src is NULL to avoid dereferencing a NULL pointer,
for MES ring is uneccessary to recieve an interrupt notification.

Signed-off-by: Jack Xiao 
Acked-by: Alex Deucher 
Reviewed-by: Hawking Zhang 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index d878fe7fee51c..3414e119f0cbf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -416,7 +416,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,
ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + 
index;
}
amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq));
-   amdgpu_irq_get(adev, irq_src, irq_type);
+
+   if (irq_src)
+   amdgpu_irq_get(adev, irq_src, irq_type);
 
ring->fence_drv.irq_src = irq_src;
ring->fence_drv.irq_type = irq_type;
@@ -537,8 +539,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev)
/* no need to trigger GPU reset as we are unloading */
amdgpu_fence_driver_force_completion(ring);
}
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
drm_sched_fini(>sched);
del_timer_sync(>fence_drv.fallback_timer);
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
@@ -574,8 +577,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev)
}
 
/* disable the interrupt */
-   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
@@ -601,8 +605,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev)
continue;
 
/* enable the interrupt */
-   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
-  ring->fence_drv.irq_type);
+   if (ring->fence_drv.irq_src)
+   amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+  ring->fence_drv.irq_type);
}
 }
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory

2020-08-10 Thread Marion & Christophe JAILLET


Le 10/08/2020 à 17:42, Dan Carpenter a écrit :

On Sun, Aug 09, 2020 at 10:34:06PM +0200, Christophe JAILLET wrote:

When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than
'sgt' (i.e struct sg_table), so this could lead to memory corruption.

The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
it won't lead to corruption.

 11  struct scatterlist {
 12  unsigned long   page_link;
 13  unsigned intoffset;
 14  unsigned intlength;
 15  dma_addr_t  dma_address;
 16  #ifdef CONFIG_NEED_SG_DMA_LENGTH
 17  unsigned intdma_length;
 18  #endif
 19  };

 42  struct sg_table {
 43  struct scatterlist *sgl;/* the list */
 44  unsigned int nents; /* number of mapped entries */
 45  unsigned int orig_nents;/* original size of list */
 46  };

regards,
dan carpenter



My bad. I read 'struct scatterlist sgl' (without the *)
Thanks for the follow-up, Dan.

Doesn't smatch catch such mismatch?
(I've not run smatch for a while, so it is maybe reported)

Well, the proposal is still valid, even if it has less impact as 
initially thought.


Thx for the review.

CJ

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu/display: drop unused function

2020-08-10 Thread Nirmoy

Acked-by: Nirmoy Das 

On 8/10/20 5:56 PM, Alex Deucher wrote:

This is not longer used as of the latest rework of this
code so drop it to avoid a unused function warning.

Signed-off-by: Alex Deucher 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 --
  1 file changed, 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0f9d87773fa3..ff5f7f7ceec6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2827,24 +2827,6 @@ dm_atomic_get_new_state(struct drm_atomic_state *state)
return NULL;
  }
  
-static struct dm_atomic_state *

-dm_atomic_get_old_state(struct drm_atomic_state *state)
-{
-   struct drm_device *dev = state->dev;
-   struct amdgpu_device *adev = dev->dev_private;
-   struct amdgpu_display_manager *dm = >dm;
-   struct drm_private_obj *obj;
-   struct drm_private_state *old_obj_state;
-   int i;
-
-   for_each_old_private_obj_in_state(state, obj, old_obj_state, i) {
-   if (obj->funcs == dm->atomic_obj.funcs)
-   return to_dm_atomic_state(old_obj_state);
-   }
-
-   return NULL;
-}
-
  static struct drm_private_state *
  dm_atomic_duplicate_state(struct drm_private_obj *obj)
  {

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu/display: drop unused function

2020-08-10 Thread Alex Deucher
This is not longer used as of the latest rework of this
code so drop it to avoid a unused function warning.

Signed-off-by: Alex Deucher 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 18 --
 1 file changed, 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0f9d87773fa3..ff5f7f7ceec6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2827,24 +2827,6 @@ dm_atomic_get_new_state(struct drm_atomic_state *state)
return NULL;
 }
 
-static struct dm_atomic_state *
-dm_atomic_get_old_state(struct drm_atomic_state *state)
-{
-   struct drm_device *dev = state->dev;
-   struct amdgpu_device *adev = dev->dev_private;
-   struct amdgpu_display_manager *dm = >dm;
-   struct drm_private_obj *obj;
-   struct drm_private_state *old_obj_state;
-   int i;
-
-   for_each_old_private_obj_in_state(state, obj, old_obj_state, i) {
-   if (obj->funcs == dm->atomic_obj.funcs)
-   return to_dm_atomic_state(old_obj_state);
-   }
-
-   return NULL;
-}
-
 static struct drm_private_state *
 dm_atomic_duplicate_state(struct drm_private_obj *obj)
 {
-- 
2.25.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal

2020-08-10 Thread Daniel Kolesa
On Mon, Aug 10, 2020, at 17:38, Alex Deucher wrote:
> On Sat, Aug 8, 2020 at 4:51 PM Daniel Kolesa  wrote:
> >
> > GFP_KERNEL may and will sleep, and this is being executed in
> > a non-preemptible context; this will mess things up since it's
> > called inbetween DC_FP_START/END, and rescheduling will result
> > in the DC_FP_END later being called in a different context (or
> > just crashing if any floating point/vector registers/instructions
> > are used after the call is resumed in a different context).
> >
> > Signed-off-by: Daniel Kolesa 
> 
> We should probably find a way to pre-allocate this, but in the
> meantime, I'll apply the patch.

Indeed. But this should work as an immediate solution for people experiencing 
issues. For me it completely stopped dmesg noise on ppc64le and aarch64, and 
seemingly even fixed the screen flicker problem on POWER that I had to formerly 
work around by forcing the clocks to high (been running it for days and it's 
worked perfectly fine with adaptive during the time).

Daniel

> 
> Thanks!
> 
> Alex
> 
> 
> > ---
> >  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
> > b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> > index 991eddd10952..c31d1f30e505 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> > @@ -3141,7 +3141,7 @@ static bool dcn20_validate_bandwidth_internal(struct 
> > dc *dc, struct dc_state *co
> > int vlevel = 0;
> > int pipe_split_from[MAX_PIPES];
> > int pipe_cnt = 0;
> > -   display_e2e_pipe_params_st *pipes = 
> > kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), 
> > GFP_KERNEL);
> > +   display_e2e_pipe_params_st *pipes = 
> > kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), 
> > GFP_ATOMIC);
> > DC_LOGGER_INIT(dc->ctx->logger);
> >
> > BW_VAL_TRACE_COUNT();
> > --
> > 2.28.0
> >
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory

2020-08-10 Thread Dan Carpenter
On Sun, Aug 09, 2020 at 10:34:06PM +0200, Christophe JAILLET wrote:
> When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
> of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than
> 'sgt' (i.e struct sg_table), so this could lead to memory corruption.

The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
it won't lead to corruption.

11  struct scatterlist {
12  unsigned long   page_link;
13  unsigned intoffset;
14  unsigned intlength;
15  dma_addr_t  dma_address;
16  #ifdef CONFIG_NEED_SG_DMA_LENGTH
17  unsigned intdma_length;
18  #endif
19  };

42  struct sg_table {
43  struct scatterlist *sgl;/* the list */
44  unsigned int nents; /* number of mapped entries */
45  unsigned int orig_nents;/* original size of list */
46  };

regards,
dan carpenter

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: add DCN support for aarch64

2020-08-10 Thread Alex Deucher
Applied.  Thanks!

Alex

On Sat, Aug 8, 2020 at 4:51 PM Daniel Kolesa  wrote:
>
> This adds ARM64 support into the DCN. This mainly enables support
> for Navi graphics cards. The dcn10 changes haven't been tested,
> since I don't have the relevant hardware available, but there
> is no way to conditionally disable them, so I've done them anyway.
>
> Signed-off-by: Daniel Kolesa 
> ---
>  drivers/gpu/drm/amd/display/Kconfig   |  2 +-
>  drivers/gpu/drm/amd/display/dc/calcs/Makefile |  7 ++
>  .../gpu/drm/amd/display/dc/clk_mgr/Makefile   |  7 ++
>  drivers/gpu/drm/amd/display/dc/dcn10/Makefile |  7 ++
>  .../drm/amd/display/dc/dcn10/dcn10_resource.c | 81 ---
>  drivers/gpu/drm/amd/display/dc/dcn20/Makefile |  4 +
>  drivers/gpu/drm/amd/display/dc/dcn21/Makefile |  4 +
>  drivers/gpu/drm/amd/display/dc/dml/Makefile   | 13 +++
>  drivers/gpu/drm/amd/display/dc/dsc/Makefile   |  5 ++
>  drivers/gpu/drm/amd/display/dc/os_types.h |  4 +
>  10 files changed, 102 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> b/drivers/gpu/drm/amd/display/Kconfig
> index 77569097a480..f24abf428534 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -6,7 +6,7 @@ config DRM_AMD_DC
> bool "AMD DC - Enable new display engine"
> default y
> select SND_HDA_COMPONENT if SND_HDA_CORE
> -   select DRM_AMD_DC_DCN if (X86 || PPC64) && !(KCOV_INSTRUMENT_ALL && 
> KCOV_ENABLE_COMPARISONS)
> +   select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && 
> KERNEL_MODE_NEON)) && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
> help
>   Choose this option if you want to use the new display engine
>   support for AMDGPU. This adds required support for Vega and
> diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile 
> b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
> index 4674aca8f206..64f515d74410 100644
> --- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
> @@ -33,6 +33,10 @@ ifdef CONFIG_PPC64
>  calcs_ccflags := -mhard-float -maltivec
>  endif
>
> +ifdef CONFIG_ARM64
> +calcs_rcflags := -mgeneral-regs-only
> +endif
> +
>  ifdef CONFIG_CC_IS_GCC
>  ifeq ($(call cc-ifversion, -lt, 0701, y), y)
>  IS_OLD_GCC = 1
> @@ -53,6 +57,9 @@ endif
>  CFLAGS_$(AMDDALPATH)/dc/calcs/dcn_calcs.o := $(calcs_ccflags)
>  CFLAGS_$(AMDDALPATH)/dc/calcs/dcn_calc_auto.o := $(calcs_ccflags)
>  CFLAGS_$(AMDDALPATH)/dc/calcs/dcn_calc_math.o := $(calcs_ccflags) 
> -Wno-tautological-compare
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/calcs/dcn_calcs.o := $(calcs_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/calcs/dcn_calc_auto.o := $(calcs_rcflags)
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/calcs/dcn_calc_math.o := $(calcs_rcflags)
>
>  BW_CALCS = dce_calcs.o bw_fixed.o custom_float.o
>
> diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile 
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> index 52b1ce775a1e..1a495759a034 100644
> --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
> @@ -104,6 +104,13 @@ ifdef CONFIG_PPC64
>  CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call 
> cc-option,-mno-gnu-attribute)
>  endif
>
> +# prevent build errors:
> +# ...: '-mgeneral-regs-only' is incompatible with the use of floating-point 
> types
> +# this file is unused on arm64, just like on ppc64
> +ifdef CONFIG_ARM64
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := 
> -mgeneral-regs-only
> +endif
> +
>  AMD_DAL_CLK_MGR_DCN21 = $(addprefix 
> $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21))
>
>  AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn10/Makefile
> index 62ad1a11bff9..733e6e6e43bd 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn10/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn10/Makefile
> @@ -31,4 +31,11 @@ DCN10 = dcn10_init.o dcn10_resource.o dcn10_ipp.o 
> dcn10_hw_sequencer.o \
>
>  AMD_DAL_DCN10 = $(addprefix $(AMDDALPATH)/dc/dcn10/,$(DCN10))
>
> +# fix:
> +# ...: '-mgeneral-regs-only' is incompatible with the use of floating-point 
> types
> +# aarch64 does not support soft-float, so use hard-float and handle this in 
> code
> +ifdef CONFIG_ARM64
> +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dcn10/dcn10_resource.o := -mgeneral-regs-only
> +endif
> +
>  AMD_DISPLAY_FILES += $(AMD_DAL_DCN10)
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
> index 17d5cb422025..07571f84e0f8 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
> @@ -1331,6 +1331,47 @@ static uint32_t read_pipe_fuses(struct dc_context *ctx)
> return value;
>  }
>
> +/*
> + * Some architectures don't support soft-float (e.g. aarch64), on those
> + * this 

Re: [PATCH 1/1] drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal

2020-08-10 Thread Alex Deucher
On Sat, Aug 8, 2020 at 4:51 PM Daniel Kolesa  wrote:
>
> GFP_KERNEL may and will sleep, and this is being executed in
> a non-preemptible context; this will mess things up since it's
> called inbetween DC_FP_START/END, and rescheduling will result
> in the DC_FP_END later being called in a different context (or
> just crashing if any floating point/vector registers/instructions
> are used after the call is resumed in a different context).
>
> Signed-off-by: Daniel Kolesa 

We should probably find a way to pre-allocate this, but in the
meantime, I'll apply the patch.

Thanks!

Alex


> ---
>  drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c 
> b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> index 991eddd10952..c31d1f30e505 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> @@ -3141,7 +3141,7 @@ static bool dcn20_validate_bandwidth_internal(struct dc 
> *dc, struct dc_state *co
> int vlevel = 0;
> int pipe_split_from[MAX_PIPES];
> int pipe_cnt = 0;
> -   display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count 
> * sizeof(display_e2e_pipe_params_st), GFP_KERNEL);
> +   display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count 
> * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC);
> DC_LOGGER_INIT(dc->ctx->logger);
>
> BW_VAL_TRACE_COUNT();
> --
> 2.28.0
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.

2020-08-10 Thread Daniel Vetter
On Mon, Aug 10, 2020 at 4:13 PM Bas Nieuwenhuizen
 wrote:
>
> On Mon, Aug 10, 2020 at 3:09 PM Daniel Vetter  wrote:
> >
> > On Mon, Aug 10, 2020 at 02:49:00PM +0200, Michel Dänzer wrote:
> > > On 2020-08-10 2:28 p.m., Daniel Vetter wrote:
> > > >
> > > > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea
> > > > here to a very strong recommendation, i.e. please do this except if
> > > > there's and amd ddx which somehow wants to change tiling mode while a fb
> > > > exists, and expects this to propagate.
> > > >
> > > > In i915 we even disallow the set_tiling ioctl with an error if an fb
> > > > exists, just to make sure userspace behaves. Even if userspace uses
> > > > set_tiling, this way we can at least enforce the same semantics of 
> > > > "client
> > > > can't pull compositor over the table with a set_tiling at the wrong 
> > > > time"
> > > > of modifiers.
> > >
> > > FWIW, xf86-video-amdgpu doesn't have any code to set the tiling
> > > metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do.
> >
> > Ah right you do everything with glamour, so this should never show up as a
> > problem.
>
> I think it is a good idea to do so, but cannot do it completely in
> this series as we don't define modifiers for GFX6-GFX8 GPU generations
> yet. (wanted to leave these out for a bit to reduce the scope for the
> initial version)

Hm right, that makes it a bit awkward.

> That said, there is a series that captures the tiling flags on FB
> creation: https://patchwork.freedesktop.org/series/80109/

Yeah, but it only pushes it down into the state objects. Good first
step, what I'm proposing is to push it all the way into addfb/struct
drm_framebuffer. Since drm_framebuffer is also an invariant thing it
makes the most sense to keep that there. I'm also discussing with
Nicholas about what would be the ideal end state.

But yeah maybe one thing at time.
-Daniel

>
> > -Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.

2020-08-10 Thread Bas Nieuwenhuizen
On Mon, Aug 10, 2020 at 3:09 PM Daniel Vetter  wrote:
>
> On Mon, Aug 10, 2020 at 02:49:00PM +0200, Michel Dänzer wrote:
> > On 2020-08-10 2:28 p.m., Daniel Vetter wrote:
> > >
> > > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea
> > > here to a very strong recommendation, i.e. please do this except if
> > > there's and amd ddx which somehow wants to change tiling mode while a fb
> > > exists, and expects this to propagate.
> > >
> > > In i915 we even disallow the set_tiling ioctl with an error if an fb
> > > exists, just to make sure userspace behaves. Even if userspace uses
> > > set_tiling, this way we can at least enforce the same semantics of "client
> > > can't pull compositor over the table with a set_tiling at the wrong time"
> > > of modifiers.
> >
> > FWIW, xf86-video-amdgpu doesn't have any code to set the tiling
> > metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do.
>
> Ah right you do everything with glamour, so this should never show up as a
> problem.

I think it is a good idea to do so, but cannot do it completely in
this series as we don't define modifiers for GFX6-GFX8 GPU generations
yet. (wanted to leave these out for a bit to reduce the scope for the
initial version)

That said, there is a series that captures the tiling flags on FB
creation: https://patchwork.freedesktop.org/series/80109/

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.

2020-08-10 Thread Daniel Vetter
On Mon, Aug 10, 2020 at 02:49:00PM +0200, Michel Dänzer wrote:
> On 2020-08-10 2:28 p.m., Daniel Vetter wrote:
> >
> > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea
> > here to a very strong recommendation, i.e. please do this except if
> > there's and amd ddx which somehow wants to change tiling mode while a fb
> > exists, and expects this to propagate.
> > 
> > In i915 we even disallow the set_tiling ioctl with an error if an fb
> > exists, just to make sure userspace behaves. Even if userspace uses
> > set_tiling, this way we can at least enforce the same semantics of "client
> > can't pull compositor over the table with a set_tiling at the wrong time"
> > of modifiers.
> 
> FWIW, xf86-video-amdgpu doesn't have any code to set the tiling
> metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do.

Ah right you do everything with glamour, so this should never show up as a
problem.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] dma-buf.rst: repair length of title underline

2020-08-10 Thread Daniel Vetter
On Mon, Aug 10, 2020 at 01:25:40PM +0200, Christian König wrote:
> Am 09.08.20 um 08:17 schrieb Lukas Bulwahn:
> > With commit 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are
> > a bad idea"), document generation warns:
> > 
> >Documentation/driver-api/dma-buf.rst:182: \
> >WARNING: Title underline too short.
> > 
> > Repair length of title underline to remove warning.
> > 
> > Fixes: 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are a bad 
> > idea")
> > Signed-off-by: Lukas Bulwahn 
> 
> Acked-by: Christian König 
> 
> Should I pick it up into drm-misc-next?

Yes please. For the future if you need to check if someone has commit
rights and can push themselves:

https://people.freedesktop.org/~seanpaul/whomisc.html

Yeah with gitlab this would all be a bit more reasonable, but we get by
meanwhile :-)

Cheers, Daniel
> 
> > ---
> > Daniel, please pick this minor non-urgent fix to your new documentation.
> > 
> >   Documentation/driver-api/dma-buf.rst | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/driver-api/dma-buf.rst 
> > b/Documentation/driver-api/dma-buf.rst
> > index 100bfd227265..13ea0cc0a3fa 100644
> > --- a/Documentation/driver-api/dma-buf.rst
> > +++ b/Documentation/driver-api/dma-buf.rst
> > @@ -179,7 +179,7 @@ DMA Fence uABI/Sync File
> >  :internal:
> >   Indefinite DMA Fences
> > -
> > +~
> >   At various times _fence with an indefinite time until dma_fence_wait()
> >   finishes have been proposed. Examples include:
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/display: convert to use le16_add_cpu()

2020-08-10 Thread Qinglang Miao
Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().

Signed-off-by: Qinglang Miao 
---
 drivers/gpu/drm/amd/display/dc/bios/command_table.c  | 4 +---
 drivers/gpu/drm/amd/display/dc/bios/command_table2.c | 5 +
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c 
b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
index 5815983ca..070459e3e 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
@@ -1877,9 +1877,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3(
 * but it is 4 either from Edid data (spec CEA 861)
 * or CEA timing table.
 */
-   params.usV_SyncOffset =
-   
cpu_to_le16(le16_to_cpu(params.usV_SyncOffset) + 1);
-
+   le16_add_cpu(_SyncOffset, 1);
}
}
 
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c 
b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
index bed91572f..e8f52eb8e 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
@@ -569,10 +569,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3(
 * but it is 4 either from Edid data (spec CEA 861)
 * or CEA timing table.
 */
-   params.v_syncoffset =
-   cpu_to_le16(le16_to_cpu(params.v_syncoffset) +
-   1);
-
+   le16_add_cpu(_syncoffset, 1);
}
}
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.

2020-08-10 Thread Michel Dänzer
On 2020-08-10 2:28 p.m., Daniel Vetter wrote:
>
> Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea
> here to a very strong recommendation, i.e. please do this except if
> there's and amd ddx which somehow wants to change tiling mode while a fb
> exists, and expects this to propagate.
> 
> In i915 we even disallow the set_tiling ioctl with an error if an fb
> exists, just to make sure userspace behaves. Even if userspace uses
> set_tiling, this way we can at least enforce the same semantics of "client
> can't pull compositor over the table with a set_tiling at the wrong time"
> of modifiers.

FWIW, xf86-video-amdgpu doesn't have any code to set the tiling
metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 7/7] drm/amd/display: Replace DRM private objects with subclassed DRM atomic state

2020-08-10 Thread Daniel Vetter
On Fri, Aug 07, 2020 at 10:32:11AM -0400, Kazlauskas, Nicholas wrote:
> On 2020-08-07 4:52 a.m., dan...@ffwll.ch wrote:
> > On Thu, Jul 30, 2020 at 04:36:42PM -0400, Nicholas Kazlauskas wrote:
> > > @@ -440,7 +431,7 @@ struct dm_crtc_state {
> > >   #define to_dm_crtc_state(x) container_of(x, struct dm_crtc_state, base)
> > >   struct dm_atomic_state {
> > > - struct drm_private_state base;
> > > + struct drm_atomic_state base;
> > >   struct dc_state *context;
> > 
> > Also curiosity: Can't we just embed dc_state here, instead of a pointer?
> > Then it would become a lot more obvious that mostly this is a state object
> > container like drm_atomic_state, but for the DC specific state structures.
> > And we could look into moving the actual DC states into drm private states
> > perhaps (if that helps with the code structure and overall flow).
> > 
> > Maybe as next steps.
> > -Daniel
> > 
> 
> It's the refcounting that's the problem with this stuff. I'd like to move DC
> to a model where we have no memory allocation/ownership but that might be a
> bit of a more long term plan at this point.
> 
> Same with dc_plane_state and dc_stream_state as well - these could exist on
> the DRM objects as long as they're not refcounted.

Hm what's the refcounting problem you're having? Or is it the lack of
refcounting, and dc having different ideas about lifetimes than atomic?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/7] drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes

2020-08-10 Thread Daniel Vetter
On Mon, Aug 10, 2020 at 02:30:05PM +0200, Christian König wrote:
> Am 10.08.20 um 14:25 schrieb Daniel Vetter:
> > On Fri, Aug 07, 2020 at 10:29:09AM -0400, Kazlauskas, Nicholas wrote:
> > > On 2020-08-07 4:30 a.m., dan...@ffwll.ch wrote:
> > > > On Thu, Jul 30, 2020 at 04:36:38PM -0400, Nicholas Kazlauskas wrote:
> > > > > [Why]
> > > > > We're racing with userspace as the flags could potentially change
> > > > > from when we acquired and validated them in commit_check.
> > > > Uh ... I didn't know these could change. I think my comments on Bas'
> > > > series are even more relevant now. I think long term would be best to 
> > > > bake
> > > > these flags in at addfb time when modifiers aren't set. And otherwise
> > > > always use the modifiers flag, and completely ignore the legacy flags
> > > > here.
> > > > -Daniel
> > > > 
> > > There's a set tiling/mod flags IOCTL that can be called after addfb 
> > > happens,
> > > so unless there's some sort of driver magic preventing this from working
> > > when it's already been pinned for scanout then I don't see anything 
> > > stopping
> > > this from happening.
> > > 
> > > I still need to review the modifiers series in a little more detail but 
> > > that
> > > looks like a good approach to fixing these kind of issues.
> > Yeah we had the same model for i915, but it's awkward and can surprise
> > compositors (since the client could change the tiling mode from underneath
> > the compositor). So freezing the tiling mode at addfb time is the right
> > thing to do, and anyway how things work with modifiers.
> > 
> > Ofc maybe good to audit the -amd driver, but hopefully it doesn't do
> > anything silly with changed tiling. If it does, it's kinda sad day.
> 
> Marek should know this right away, but I think we only set the tilling flags
> once while exporting the BO and then never change them.

Yeah it's the only reasonable model really, since set/get_tiling is not
synchronized with any winsys protocol. So full of races by design.

The only thing I'd worry about is if you do set_tiling after addfb in your
ddx. That one is save, but would badly break if we sample modifiers from
set_tiling flags only once at addfb time.
-Daniel

> 
> Regards,
> Christian.
> 
> > 
> > Btw for i915 we even went a step further, and made the set_tiling ioctl
> > return an error if a framebuffer for that gem_bo existed. Just to make
> > sure we don't ever accidentally break this.
> > 
> > Cheers, Daniel
> > 
> > > Regards,
> > > Nicholas Kazlauskas
> > > 
> > > > > [How]
> > > > > We unfortunately can't drop this function in its entirety from
> > > > > prepare_planes since we don't know the afb->address at commit_check
> > > > > time yet.
> > > > > 
> > > > > So instead of querying new tiling_flags and tmz_surface use the ones
> > > > > from the plane_state directly.
> > > > > 
> > > > > While we're at it, also update the force_disable_dcc option based
> > > > > on the state from atomic check.
> > > > > 
> > > > > Cc: Bhawanpreet Lakha 
> > > > > Cc: Rodrigo Siqueira 
> > > > > Signed-off-by: Nicholas Kazlauskas 
> > > > > ---
> > > > >.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 
> > > > > ++-
> > > > >1 file changed, 19 insertions(+), 17 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > index bf1881bd492c..f78c09c9585e 100644
> > > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > > > @@ -5794,14 +5794,8 @@ static int dm_plane_helper_prepare_fb(struct 
> > > > > drm_plane *plane,
> > > > >   struct list_head list;
> > > > >   struct ttm_validate_buffer tv;
> > > > >   struct ww_acquire_ctx ticket;
> > > > > - uint64_t tiling_flags;
> > > > >   uint32_t domain;
> > > > >   int r;
> > > > > - bool tmz_surface = false;
> > > > > - bool force_disable_dcc = false;
> > > > > -
> > > > > - dm_plane_state_old = to_dm_plane_state(plane->state);
> > > > > - dm_plane_state_new = to_dm_plane_state(new_state);
> > > > >   if (!new_state->fb) {
> > > > >   DRM_DEBUG_DRIVER("No FB bound\n");
> > > > > @@ -5845,27 +5839,35 @@ static int dm_plane_helper_prepare_fb(struct 
> > > > > drm_plane *plane,
> > > > >   return r;
> > > > >   }
> > > > > - amdgpu_bo_get_tiling_flags(rbo, _flags);
> > > > > -
> > > > > - tmz_surface = amdgpu_bo_encrypted(rbo);
> > > > > -
> > > > >   ttm_eu_backoff_reservation(, );
> > > > >   afb->address = amdgpu_bo_gpu_offset(rbo);
> > > > >   amdgpu_bo_ref(rbo);
> > > > > + /**
> > > > > +  * We don't do surface updates on planes that have been newly 
> > > > > created,
> > > > > +  * but we also don't have the afb->address during atomic check.
> > > > > +  *
> > > > > +  * Fill in buffer attributes depending on the address here, but 
> 

Re: [PATCH 3/7] drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes

2020-08-10 Thread Christian König

Am 10.08.20 um 14:25 schrieb Daniel Vetter:

On Fri, Aug 07, 2020 at 10:29:09AM -0400, Kazlauskas, Nicholas wrote:

On 2020-08-07 4:30 a.m., dan...@ffwll.ch wrote:

On Thu, Jul 30, 2020 at 04:36:38PM -0400, Nicholas Kazlauskas wrote:

[Why]
We're racing with userspace as the flags could potentially change
from when we acquired and validated them in commit_check.

Uh ... I didn't know these could change. I think my comments on Bas'
series are even more relevant now. I think long term would be best to bake
these flags in at addfb time when modifiers aren't set. And otherwise
always use the modifiers flag, and completely ignore the legacy flags
here.
-Daniel


There's a set tiling/mod flags IOCTL that can be called after addfb happens,
so unless there's some sort of driver magic preventing this from working
when it's already been pinned for scanout then I don't see anything stopping
this from happening.

I still need to review the modifiers series in a little more detail but that
looks like a good approach to fixing these kind of issues.

Yeah we had the same model for i915, but it's awkward and can surprise
compositors (since the client could change the tiling mode from underneath
the compositor). So freezing the tiling mode at addfb time is the right
thing to do, and anyway how things work with modifiers.

Ofc maybe good to audit the -amd driver, but hopefully it doesn't do
anything silly with changed tiling. If it does, it's kinda sad day.


Marek should know this right away, but I think we only set the tilling 
flags once while exporting the BO and then never change them.


Regards,
Christian.



Btw for i915 we even went a step further, and made the set_tiling ioctl
return an error if a framebuffer for that gem_bo existed. Just to make
sure we don't ever accidentally break this.

Cheers, Daniel


Regards,
Nicholas Kazlauskas


[How]
We unfortunately can't drop this function in its entirety from
prepare_planes since we don't know the afb->address at commit_check
time yet.

So instead of querying new tiling_flags and tmz_surface use the ones
from the plane_state directly.

While we're at it, also update the force_disable_dcc option based
on the state from atomic check.

Cc: Bhawanpreet Lakha 
Cc: Rodrigo Siqueira 
Signed-off-by: Nicholas Kazlauskas 
---
   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
   1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index bf1881bd492c..f78c09c9585e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5794,14 +5794,8 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
struct list_head list;
struct ttm_validate_buffer tv;
struct ww_acquire_ctx ticket;
-   uint64_t tiling_flags;
uint32_t domain;
int r;
-   bool tmz_surface = false;
-   bool force_disable_dcc = false;
-
-   dm_plane_state_old = to_dm_plane_state(plane->state);
-   dm_plane_state_new = to_dm_plane_state(new_state);
if (!new_state->fb) {
DRM_DEBUG_DRIVER("No FB bound\n");
@@ -5845,27 +5839,35 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
return r;
}
-   amdgpu_bo_get_tiling_flags(rbo, _flags);
-
-   tmz_surface = amdgpu_bo_encrypted(rbo);
-
ttm_eu_backoff_reservation(, );
afb->address = amdgpu_bo_gpu_offset(rbo);
amdgpu_bo_ref(rbo);
+   /**
+* We don't do surface updates on planes that have been newly created,
+* but we also don't have the afb->address during atomic check.
+*
+* Fill in buffer attributes depending on the address here, but only on
+* newly created planes since they're not being used by DC yet and this
+* won't modify global state.
+*/
+   dm_plane_state_old = to_dm_plane_state(plane->state);
+   dm_plane_state_new = to_dm_plane_state(new_state);
+
if (dm_plane_state_new->dc_state &&
-   dm_plane_state_old->dc_state != 
dm_plane_state_new->dc_state) {
-   struct dc_plane_state *plane_state = 
dm_plane_state_new->dc_state;
+   dm_plane_state_old->dc_state != dm_plane_state_new->dc_state) {
+   struct dc_plane_state *plane_state =
+   dm_plane_state_new->dc_state;
+   bool force_disable_dcc = !plane_state->dcc.enable;
-   force_disable_dcc = adev->asic_type == CHIP_RAVEN && 
adev->in_suspend;
fill_plane_buffer_attributes(
adev, afb, plane_state->format, plane_state->rotation,
-   tiling_flags, _state->tiling_info,
-   _state->plane_size, _state->dcc,
-   _state->address, tmz_surface,
-   force_disable_dcc);
+   

Re: [PATCH 5/7] drm/amd/display: Reset plane for anything that's not a FAST update

2020-08-10 Thread Daniel Vetter
On Fri, Aug 07, 2020 at 10:26:51AM -0400, Kazlauskas, Nicholas wrote:
> On 2020-08-07 4:34 a.m., dan...@ffwll.ch wrote:
> > On Thu, Jul 30, 2020 at 04:36:40PM -0400, Nicholas Kazlauskas wrote:
> > > [Why]
> > > MEDIUM or FULL updates can require global validation or affect
> > > bandwidth. By treating these all simply as surface updates we aren't
> > > actually passing this through DC global validation.
> > > 
> > > [How]
> > > There's currently no way to pass surface updates through DC global
> > > validation, nor do I think it's a good idea to change the interface
> > > to accept these.
> > > 
> > > DC global validation itself is currently stateless, and we can move
> > > our update type checking to be stateless as well by duplicating DC
> > > surface checks in DM based on DRM properties.
> > > 
> > > We wanted to rely on DC automatically determining this since DC knows
> > > best, but DM is ultimately what fills in everything into DC plane
> > > state so it does need to know as well.
> > > 
> > > There are basically only three paths that we exercise in DM today:
> > > 
> > > 1) Cursor (async update)
> > > 2) Pageflip (fast update)
> > > 3) Full pipe programming (medium/full updates)
> > > 
> > > Which means that anything that's more than a pageflip really needs to
> > > go down path #3.
> > > 
> > > So this change duplicates all the surface update checks based on DRM
> > > state instead inside of should_reset_plane().
> > > 
> > > Next step is dropping dm_determine_update_type_for_commit and we no
> > > longer require the old DC state at all for global validation.
> > 
> > I think we do something similar in i915, where we have a "nothing except
> > base address changed" fast path, but for anything else we fully compute a
> > new state. Obviously you should try to keep global state synchronization
> > to a minimum for this step, so it's not entirely only 2 options.
> > 
> > Once we have the states, we compare them and figure out whether we can get
> > away with a fast modeset operation (maybe what you guys call medium
> > update). Anyway I think being slightly more aggressive with computing full
> > state, and then falling back to more optimized update again is a good
> > approach. Only risk is if we you have too much synchronization in your
> > locking (e.g. modern compositors do like to change tiling and stuff,
> > especially once you have modifiers enabled, so this shouldn't cause a sync
> > across crtc except when absolutely needed).
> > -Daniel
> 
> Sounds like the right approach then.
> 
> We can support tiling changes in the fast path, but the more optimized
> version of that last check is really linear <-> tiled. That requires global
> validation with DC to revalidate bandwidth and calculate requestor
> parameters for HW. So we'll have to stall for some of these changes
> unfortunately since we need the full HW state for validation.

Yeah I think that's perfectly ok, and probably worth it to still optimize
for tiled->tiled changes. If you can. tiled<->untiled only happens for
boot-splash, so no one cares, but with modifiers the idea is that tiling
changes (especially once we get into the different compression modes
amdgpu support, and which Bas' series tries to enable) are fairly common
and should be doable without any stalls anywhere.

Cheers, Daniel

> 
> Regards,
> Nicholas Kazlauskas
> 
> > 
> > > 
> > > Optimization can come later so we don't reset DC planes at all for
> > > MEDIUM udpates and avoid validation, but we might require some extra
> > > checks in DM to achieve this.
> > > 
> > > Cc: Bhawanpreet Lakha 
> > > Cc: Hersen Wu 
> > > Signed-off-by: Nicholas Kazlauskas 
> > > ---
> > >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 25 +++
> > >   1 file changed, 25 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > index 0d5f45742bb5..2cbb29199e61 100644
> > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > @@ -8336,6 +8336,31 @@ static bool should_reset_plane(struct 
> > > drm_atomic_state *state,
> > >   if (old_other_state->crtc != new_other_state->crtc)
> > >   return true;
> > > + /* Src/dst size and scaling updates. */
> > > + if (old_other_state->src_w != new_other_state->src_w ||
> > > + old_other_state->src_h != new_other_state->src_h ||
> > > + old_other_state->crtc_w != new_other_state->crtc_w ||
> > > + old_other_state->crtc_h != new_other_state->crtc_h)
> > > + return true;
> > > +
> > > + /* Rotation / mirroring updates. */
> > > + if (old_other_state->rotation != new_other_state->rotation)
> > > + return true;
> > > +
> > > + /* Blending updates. */
> > > + if (old_other_state->pixel_blend_mode !=
> > > + 

Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.

2020-08-10 Thread Daniel Vetter
On Wed, Aug 05, 2020 at 09:32:10AM +0200, dan...@ffwll.ch wrote:
> On Tue, Aug 04, 2020 at 11:31:17PM +0200, Bas Nieuwenhuizen wrote:
> > This sets the DC tiling options from the modifier, if modifiers
> > are used for the FB. This patch by itself does not expose the
> > support yet though.
> > 
> > There is not much validation yet to limit the scope of this
> > patch, but the current validation is at the same level as
> > the BO metadata path.
> > 
> > Signed-off-by: Bas Nieuwenhuizen 
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 109 +-
> >  1 file changed, 103 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 6ef7f2f8acab..ac913b8f10ef 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -3754,6 +3754,93 @@ fill_gfx9_plane_attributes_from_flags(struct 
> > amdgpu_device *adev,
> > return 0;
> >  }
> >  
> > +static bool
> > +modifier_has_dcc(uint64_t modifier)
> > +{
> > +   return IS_AMD_FMT_MOD(modifier) && AMD_FMT_MOD_GET(DCC, modifier);
> > +}
> > +
> > +static unsigned
> > +modifier_gfx9_swizzle_mode(uint64_t modifier)
> > +{
> > +   if (modifier == DRM_FORMAT_MOD_LINEAR)
> > +   return 0;
> > +
> > +   return AMD_FMT_MOD_GET(TILE, modifier);
> > +}
> > +
> > +static void
> > +fill_gfx9_tiling_info_from_modifier(const struct amdgpu_device *adev,
> > + union dc_tiling_info *tiling_info,
> > + uint64_t modifier)
> > +{
> > +   unsigned int mod_bank_xor_bits = AMD_FMT_MOD_GET(BANK_XOR_BITS, 
> > modifier);
> > +   unsigned int mod_pipe_xor_bits = AMD_FMT_MOD_GET(PIPE_XOR_BITS, 
> > modifier);
> > +   unsigned int pkrs_log2 = AMD_FMT_MOD_GET(PACKERS, modifier);
> > +   unsigned int pipes_log2 = min(4u, mod_pipe_xor_bits);
> > +
> > +   fill_gfx9_tiling_info_from_device(adev, tiling_info);
> > +
> > +   if (!IS_AMD_FMT_MOD(modifier))
> > +   return;
> > +
> > +   tiling_info->gfx9.num_pipes = 1u << pipes_log2;
> > +   tiling_info->gfx9.num_shader_engines = 1u << (mod_pipe_xor_bits - 
> > pipes_log2);
> > +
> > +   if (adev->family >= AMDGPU_FAMILY_NV) {
> > +   tiling_info->gfx9.num_pkrs = 1u << pkrs_log2;
> > +   } else {
> > +   tiling_info->gfx9.num_banks = 1u << mod_bank_xor_bits;
> > +
> > +   /* for DCC we know it isn't rb aligned, so rb_per_se doesn't 
> > matter. */
> > +   }
> > +}
> > +
> > +static void
> > +block_alignment(unsigned int blocksize_log2, unsigned int *width, unsigned 
> > int *height)
> > +{
> > +   unsigned int height_log2 = blocksize_log2 / 2;
> > +   unsigned int width_log2 = blocksize_log2 - height_log2;
> > +
> > +   *width = 1u << width_log2;
> > +   *height = 1u << height_log2;
> > +}
> > +
> > +static int
> > +fill_gfx9_plane_attributes_from_modifiers(struct amdgpu_device *adev,
> > + const struct amdgpu_framebuffer *afb,
> > + const enum surface_pixel_format format,
> > + const enum dc_rotation_angle rotation,
> > + const struct plane_size *plane_size,
> > + union dc_tiling_info *tiling_info,
> > + struct dc_plane_dcc_param *dcc,
> > + struct dc_plane_address *address,
> > + const bool force_disable_dcc)
> > +{
> > +   const uint64_t modifier = afb->base.modifier;
> > +   int ret;
> > +
> > +   fill_gfx9_tiling_info_from_modifier(adev, tiling_info, modifier);
> > +   tiling_info->gfx9.swizzle = modifier_gfx9_swizzle_mode(modifier);
> > +
> > +   if (modifier_has_dcc(modifier) && !force_disable_dcc) {
> > +   uint64_t dcc_address = afb->address + afb->base.offsets[1];
> > +
> > +   dcc->enable = 1;
> > +   dcc->meta_pitch = afb->base.pitches[1];
> > +   dcc->independent_64b_blks = 
> > AMD_FMT_MOD_GET(DCC_INDEPENDENT_64B, modifier);
> > +
> > +   address->grph.meta_addr.low_part = lower_32_bits(dcc_address);
> > +   address->grph.meta_addr.high_part = upper_32_bits(dcc_address);
> > +   }
> > +
> > +   ret = validate_dcc(adev, format, rotation, tiling_info, dcc, address, 
> > plane_size);
> > +   if (ret)
> > +   return ret;
> > +
> > +   return 0;
> > +}
> > +
> >  static int
> >  fill_plane_buffer_attributes(struct amdgpu_device *adev,
> >  const struct amdgpu_framebuffer *afb,
> > @@ -3823,12 +3910,22 @@ fill_plane_buffer_attributes(struct amdgpu_device 
> > *adev,
> >  
> >  
> > if (adev->family >= AMDGPU_FAMILY_AI) {
> > -   ret = fill_gfx9_plane_attributes_from_flags(adev, afb, format, 
> > rotation,
> > -   plane_size, 
> > tiling_info, dcc,
> > 

Re: [PATCH 3/7] drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes

2020-08-10 Thread Daniel Vetter
On Fri, Aug 07, 2020 at 10:29:09AM -0400, Kazlauskas, Nicholas wrote:
> On 2020-08-07 4:30 a.m., dan...@ffwll.ch wrote:
> > On Thu, Jul 30, 2020 at 04:36:38PM -0400, Nicholas Kazlauskas wrote:
> > > [Why]
> > > We're racing with userspace as the flags could potentially change
> > > from when we acquired and validated them in commit_check.
> > 
> > Uh ... I didn't know these could change. I think my comments on Bas'
> > series are even more relevant now. I think long term would be best to bake
> > these flags in at addfb time when modifiers aren't set. And otherwise
> > always use the modifiers flag, and completely ignore the legacy flags
> > here.
> > -Daniel
> > 
> 
> There's a set tiling/mod flags IOCTL that can be called after addfb happens,
> so unless there's some sort of driver magic preventing this from working
> when it's already been pinned for scanout then I don't see anything stopping
> this from happening.
> 
> I still need to review the modifiers series in a little more detail but that
> looks like a good approach to fixing these kind of issues.

Yeah we had the same model for i915, but it's awkward and can surprise
compositors (since the client could change the tiling mode from underneath
the compositor). So freezing the tiling mode at addfb time is the right
thing to do, and anyway how things work with modifiers.

Ofc maybe good to audit the -amd driver, but hopefully it doesn't do
anything silly with changed tiling. If it does, it's kinda sad day.

Btw for i915 we even went a step further, and made the set_tiling ioctl
return an error if a framebuffer for that gem_bo existed. Just to make
sure we don't ever accidentally break this.

Cheers, Daniel

> 
> Regards,
> Nicholas Kazlauskas
> 
> > > 
> > > [How]
> > > We unfortunately can't drop this function in its entirety from
> > > prepare_planes since we don't know the afb->address at commit_check
> > > time yet.
> > > 
> > > So instead of querying new tiling_flags and tmz_surface use the ones
> > > from the plane_state directly.
> > > 
> > > While we're at it, also update the force_disable_dcc option based
> > > on the state from atomic check.
> > > 
> > > Cc: Bhawanpreet Lakha 
> > > Cc: Rodrigo Siqueira 
> > > Signed-off-by: Nicholas Kazlauskas 
> > > ---
> > >   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++-
> > >   1 file changed, 19 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > index bf1881bd492c..f78c09c9585e 100644
> > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > > @@ -5794,14 +5794,8 @@ static int dm_plane_helper_prepare_fb(struct 
> > > drm_plane *plane,
> > >   struct list_head list;
> > >   struct ttm_validate_buffer tv;
> > >   struct ww_acquire_ctx ticket;
> > > - uint64_t tiling_flags;
> > >   uint32_t domain;
> > >   int r;
> > > - bool tmz_surface = false;
> > > - bool force_disable_dcc = false;
> > > -
> > > - dm_plane_state_old = to_dm_plane_state(plane->state);
> > > - dm_plane_state_new = to_dm_plane_state(new_state);
> > >   if (!new_state->fb) {
> > >   DRM_DEBUG_DRIVER("No FB bound\n");
> > > @@ -5845,27 +5839,35 @@ static int dm_plane_helper_prepare_fb(struct 
> > > drm_plane *plane,
> > >   return r;
> > >   }
> > > - amdgpu_bo_get_tiling_flags(rbo, _flags);
> > > -
> > > - tmz_surface = amdgpu_bo_encrypted(rbo);
> > > -
> > >   ttm_eu_backoff_reservation(, );
> > >   afb->address = amdgpu_bo_gpu_offset(rbo);
> > >   amdgpu_bo_ref(rbo);
> > > + /**
> > > +  * We don't do surface updates on planes that have been newly created,
> > > +  * but we also don't have the afb->address during atomic check.
> > > +  *
> > > +  * Fill in buffer attributes depending on the address here, but only on
> > > +  * newly created planes since they're not being used by DC yet and this
> > > +  * won't modify global state.
> > > +  */
> > > + dm_plane_state_old = to_dm_plane_state(plane->state);
> > > + dm_plane_state_new = to_dm_plane_state(new_state);
> > > +
> > >   if (dm_plane_state_new->dc_state &&
> > > - dm_plane_state_old->dc_state != 
> > > dm_plane_state_new->dc_state) {
> > > - struct dc_plane_state *plane_state = 
> > > dm_plane_state_new->dc_state;
> > > + dm_plane_state_old->dc_state != dm_plane_state_new->dc_state) {
> > > + struct dc_plane_state *plane_state =
> > > + dm_plane_state_new->dc_state;
> > > + bool force_disable_dcc = !plane_state->dcc.enable;
> > > - force_disable_dcc = adev->asic_type == CHIP_RAVEN && 
> > > adev->in_suspend;
> > >   fill_plane_buffer_attributes(
> > >   adev, afb, plane_state->format, 
> > > plane_state->rotation,
> > > - 

RE: [PATCH 12/15] drm/amd/display: Use hw lock mgr

2020-08-10 Thread Wood, Wyatt
The condition in which it’s enabled is hidden from upstream.

From: Brandon Wright 
Sent: Saturday, August 8, 2020 10:11 PM
To: Brol, Eryk ; amd-gfx@lists.freedesktop.org; Wood, Wyatt 

Subject: Re: [PATCH 12/15] drm/amd/display: Use hw lock mgr

Just curious, but I noticed this new lock manager isn't being used because of 
the definitive false return value in 
display/dc/dce/dmub_hw_lock_mgr.c:should_use_dmub_lock.
Was this supposed to be enabled?


From: Wyatt Wood mailto:wyatt.w...@amd.com>>



[Why]

Feature requires synchronization of dig, pipe, and cursor locking

between driver and fw.

[How]

Set flag to force psr to use hw lock mgr.



Signed-off-by: Wyatt Wood mailto:wyatt.w...@amd.com>>

Reviewed-by: Anthony Koo mailto:anthony@amd.com>>

Acked-by: Eryk Brol mailto:eryk.b...@amd.com>>

---

 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 2 +-

 1 file changed, 1 insertion(+), 1 deletion(-)



diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c

b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c

index 82e67bd81f2d..5167d6b8a48d 100644

--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c

+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c

@@ -233,8 +233,8 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub,

copy_settings_data->frame_cap_ind   =

psr_context->psrFrameCaptureIndicationReq;

copy_settings_data->debug.bitfields.visual_confirm  =

dc->dc->debug.visual_confirm == VISUAL_CONFIRM_PSR ?

true :

false;

+   copy_settings_data->debug.bitfields.use_hw_lock_mgr = 1;

copy_settings_data->init_sdp_deadline   =

psr_context->sdpTransmitLineNumDeadline;

-   copy_settings_data->debug.bitfields.use_hw_lock_mgr = 0;



dc_dmub_srv_cmd_queue(dc->dmub_srv, );

dc_dmub_srv_cmd_execute(dc->dmub_srv);

--

2.25.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Non-deterministically boot into dark screen with `amdgpu`

2020-08-10 Thread Christian König

Hi guys,

Am 10.08.20 um 08:43 schrieb Alexander Monakov:

Hi,

you should Сс a specialized mailing list and a relevant maintainer,
otherwise your email is likely to be ignored as LKML is an incredibly
high-volume list. Adding amd-gfx and Alex Deucher.


Thanks for forwarding this. AFAIK we haven't heard of this bug before, 
but Alex already might know more about it.



More thoughts below.

On Sun, 9 Aug 2020, Ignat Insarov wrote:


Hello!

This is an issue report. I am not familiar with the Linux kernel
development procedure, so please direct me to a more appropriate or
specialized medium if this is not the right avenue.

My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than
not. Screen blackness correlates with a line in the `systemd` journal
that says `RAM width Nbits DDR4`, where N is either 128 (resulting in
dark screen) or 64 (resulting in a healthy boot). The number seems to
be chosen at random with bias towards 128. This has been going on for
a while so here is some statistics:

* 356 boots proceed far enough to  attempt mode setting.
* 82 boots set RAM width to 64 bits and presumably succeed.
* 274 boots set RAM width to 128 bits and presumably fail.

The issue is prevented with the `nomodeset` kernel option.

I reported this previously (about a year ago) on the forum of my Linux
distribution.[1] The issue still persists as of  linux 5.8.0.

The details of my graphics controller, as well as some journal
excerpts, can be seen at [1]. One thing that has changed since then is
that on failure, there now appears a null pointer dereference error. I
am attaching the log of kernel messages from the most recent failed
boot — please request more information if needed.

I appreciate any directions and advice as to how I may go about fixing
this annoyance.

[1]: https://bbs.archlinux.org/viewtopic.php?id=248273


On the forum you show that in the "success" case there's one less "BIOS
signature incorrect" message. This implies that amdgpu_get_bios() in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
gets the video BIOS from a different source. If that happens every time
(one "signature incorrect" message for "success", two for "failure")
that may be relevant to the problem you're experiencing.

If you don't mind patching and rebuilding the kernel I suggest adding
debug printks to the aforementioned function to see exactly which methods
fail with wrong signature and which succeeds.

Also might be worthwhile to check if there's a BIOS update for your laptop.


It might also be a good idea to try the latest amd-staging-drm-next 
branch from Alex repository (bear with me I don't have the link at hand, 
but it should be easy to find).


Opening a bug report or searching the existing ones for something 
similar under https://gitlab.freedesktop.org/drm/amd/-/issues might be a 
good idea as well.


And I completely agree that this sounds like an issue getting the BIOS 
image.


Thanks,
Christian.



Alexander

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: fix reload KMD hang on GFX10 KIQ

2020-08-10 Thread Christian König

Am 10.08.20 um 05:59 schrieb Monk Liu:

GFX10 KIQ will hang if we try below steps:
modprobe amdgpu
rmmod amdgpu
modprobe amdgpu sched_hw_submission=4

Due to KIQ is always living there even after KMD unloaded
thus when doing the realod KIQ will crash upon its register
being programed by different values with the previous loading
(the config like HQD addr, ring size, is easily changed if we alter
the sched_hw_submission)

the fix is we must inactive KIQ first before touching any
of its registgers

Signed-off-by: Monk Liu 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 622f442..0702c94 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -6435,6 +6435,10 @@ static int gfx_v10_0_kiq_init_register(struct 
amdgpu_ring *ring)
struct v10_compute_mqd *mqd = ring->mqd_ptr;
int j;
  
+	/* inactivate the queue */

+   if (amdgpu_sriov_vf(adev))


Could you think of any reason why we shouldn't do this on bare metal as 
well?


I mean it can't hurt to be extra careful even if the KIQ shouldn't be 
running.


Christian.


+   WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, 0);
+
/* disable wptr polling */
WREG32_FIELD15(GC, 0, CP_PQ_WPTR_POLL_CNTL, EN, 0);
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] dma-buf.rst: repair length of title underline

2020-08-10 Thread Christian König

Am 09.08.20 um 08:17 schrieb Lukas Bulwahn:

With commit 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are
a bad idea"), document generation warns:

   Documentation/driver-api/dma-buf.rst:182: \
   WARNING: Title underline too short.

Repair length of title underline to remove warning.

Fixes: 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are a bad 
idea")
Signed-off-by: Lukas Bulwahn 


Acked-by: Christian König 

Should I pick it up into drm-misc-next?


---
Daniel, please pick this minor non-urgent fix to your new documentation.

  Documentation/driver-api/dma-buf.rst | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/driver-api/dma-buf.rst 
b/Documentation/driver-api/dma-buf.rst
index 100bfd227265..13ea0cc0a3fa 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -179,7 +179,7 @@ DMA Fence uABI/Sync File
 :internal:
  
  Indefinite DMA Fences

-
+~
  
  At various times _fence with an indefinite time until dma_fence_wait()

  finishes have been proposed. Examples include:


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64

2020-08-10 Thread Christian König

Am 10.08.20 um 12:50 schrieb Michel Dänzer:

On 2020-08-09 2:13 p.m., Christian König wrote:

Am 08.08.20 um 15:50 schrieb Jiaxun Yang:

在 2020/8/8 下午9:41, Thomas Bogendoerfer 写道:

On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:

Loongson processors have a writecombine issue that maybe failed to
write back framebuffer used with ATI Radeon or AMD GPU at times,
after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
mapping for MIPS"), there exists some errors such as blurred screen
and lockup, and so on.

Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
influence on the other platforms.

well it's not my call to take or reject this patch, but I already
indicated it might be better to disable writecombine on the CPU
detection side (or do you have other devices where writecombining
works ?). Something like below will disbale it for all loongson64 CPUs.
If you now find out where it works and where it doesn't, you can even
reduce it to the required minium of affected CPUs.

Hi Tiezhu, Thomas,

Yes, writecombine works well on LS7A's internal GPU
And even works well with some AMD GPUs (in my case, RX550).

In this case the patch is a clear NAK since you haven't root caused the
issue and are just working around it in a very questionable manner.

To be fair though, amdgpu & radeon are already disabling write-combining
for system memory pages in 32-bit x86 kernels for similar reasons.


Yeah, well that is USWC for system memory. But this is about WC for the 
VRAM BAR.


When we don't understand or don't correctly implement something on the 
platform for USWC then this is annoying, but not a serious issue.


But when the hardware doesn't correctly implement WC for PCIe BARs, then 
this is a violation of the PCIe spec and a bit more serious issue for 
the whole platform.


We can work around that by disabling WC for PCIe BARs on the whole 
platform, or behind specific bridges or or or, but patching each 
individual driver so that they work is not really the right approach.


Cheers,
Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64

2020-08-10 Thread Michel Dänzer
On 2020-08-09 2:13 p.m., Christian König wrote:
> Am 08.08.20 um 15:50 schrieb Jiaxun Yang:
>> 在 2020/8/8 下午9:41, Thomas Bogendoerfer 写道:
>>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:
 Loongson processors have a writecombine issue that maybe failed to
 write back framebuffer used with ATI Radeon or AMD GPU at times,
 after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
 mapping for MIPS"), there exists some errors such as blurred screen
 and lockup, and so on.

 Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
 Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
 influence on the other platforms.
>>> well it's not my call to take or reject this patch, but I already
>>> indicated it might be better to disable writecombine on the CPU
>>> detection side (or do you have other devices where writecombining
>>> works ?). Something like below will disbale it for all loongson64 CPUs.
>>> If you now find out where it works and where it doesn't, you can even
>>> reduce it to the required minium of affected CPUs.
>> Hi Tiezhu, Thomas,
>>
>> Yes, writecombine works well on LS7A's internal GPU
>> And even works well with some AMD GPUs (in my case, RX550).
> 
> In this case the patch is a clear NAK since you haven't root caused the
> issue and are just working around it in a very questionable manner.

To be fair though, amdgpu & radeon are already disabling write-combining
for system memory pages in 32-bit x86 kernels for similar reasons.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory

2020-08-10 Thread Christian König

Am 09.08.20 um 22:34 schrieb Christophe JAILLET:

When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than
'sgt' (i.e struct sg_table), so this could lead to memory corruption.

Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf 
v3")
Signed-off-by: Christophe JAILLET 


Good catch, Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 134cc36e30c5..0739e259bf91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -462,7 +462,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
unsigned int pages;
int i, r;
  
-	*sgt = kmalloc(sizeof(*sg), GFP_KERNEL);

+   *sgt = kmalloc(sizeof(**sgt), GFP_KERNEL);
if (!*sgt)
return -ENOMEM;
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Non-deterministically boot into dark screen with `amdgpu`

2020-08-10 Thread Alexander Monakov
Hi,

you should Сс a specialized mailing list and a relevant maintainer,
otherwise your email is likely to be ignored as LKML is an incredibly
high-volume list. Adding amd-gfx and Alex Deucher.

More thoughts below.

On Sun, 9 Aug 2020, Ignat Insarov wrote:

> Hello!
> 
> This is an issue report. I am not familiar with the Linux kernel
> development procedure, so please direct me to a more appropriate or
> specialized medium if this is not the right avenue.
> 
> My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than
> not. Screen blackness correlates with a line in the `systemd` journal
> that says `RAM width Nbits DDR4`, where N is either 128 (resulting in
> dark screen) or 64 (resulting in a healthy boot). The number seems to
> be chosen at random with bias towards 128. This has been going on for
> a while so here is some statistics:
> 
> * 356 boots proceed far enough to  attempt mode setting.
> * 82 boots set RAM width to 64 bits and presumably succeed.
> * 274 boots set RAM width to 128 bits and presumably fail.
> 
> The issue is prevented with the `nomodeset` kernel option.
> 
> I reported this previously (about a year ago) on the forum of my Linux
> distribution.[1] The issue still persists as of  linux 5.8.0.
> 
> The details of my graphics controller, as well as some journal
> excerpts, can be seen at [1]. One thing that has changed since then is
> that on failure, there now appears a null pointer dereference error. I
> am attaching the log of kernel messages from the most recent failed
> boot — please request more information if needed.
> 
> I appreciate any directions and advice as to how I may go about fixing
> this annoyance.
> 
> [1]: https://bbs.archlinux.org/viewtopic.php?id=248273


On the forum you show that in the "success" case there's one less "BIOS
signature incorrect" message. This implies that amdgpu_get_bios() in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
gets the video BIOS from a different source. If that happens every time
(one "signature incorrect" message for "success", two for "failure")
that may be relevant to the problem you're experiencing.

If you don't mind patching and rebuilding the kernel I suggest adding
debug printks to the aforementioned function to see exactly which methods
fail with wrong signature and which succeeds.

Also might be worthwhile to check if there's a BIOS update for your laptop.

Alexander___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64

2020-08-10 Thread Tiezhu Yang

On 08/09/2020 08:13 PM, Christian König wrote:

Am 08.08.20 um 15:50 schrieb Jiaxun Yang:



在 2020/8/8 下午9:41, Thomas Bogendoerfer 写道:

On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote:

Loongson processors have a writecombine issue that maybe failed to
write back framebuffer used with ATI Radeon or AMD GPU at times,
after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine
mapping for MIPS"), there exists some errors such as blurred screen
and lockup, and so on.

Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for
Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any
influence on the other platforms.

well it's not my call to take or reject this patch, but I already
indicated it might be better to disable writecombine on the CPU
detection side (or do you have other devices where writecombining
works ?). Something like below will disbale it for all loongson64 CPUs.
If you now find out where it works and where it doesn't, you can even
reduce it to the required minium of affected CPUs.

Hi Tiezhu, Thomas,

Yes, writecombine works well on LS7A's internal GPU
And even works well with some AMD GPUs (in my case, RX550).


In this case the patch is a clear NAK since you haven't root caused 
the issue and are just working around it in a very questionable manner.




Tiezhu, is it possible to investigate the issue deeper in Loongson?
Probably we just need to add some barrier to maintain the data 
coherency,
or disable writecombine for AMD GPU's command buffer and leave 
texture/frame

buffer wc accelerated.


Have you moved any buffer to VRAM and forgot to add an HDP 
flush/invalidate?


The acceleration is not much of a problem, but if WC doesn't work in 
general you need to disable it for the whole CPU and not for 
individual drivers.


Hi Thomas, Jiaxun and Christian,

Thank you very much for your suggestions.

Actually, this patch is a temporary solution to just make it work well,
it is not a proper and final solution.

I understand your opinions, it will take some time to find the root cause.

Thanks,
Tiezhu



Regards,
Christian.



Thanks.

- Jiaxun


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/2] drm/amdgpu: add debugfs node to toggle ras error cnt harvest

2020-08-10 Thread Zhang, Hawking
[AMD Public Use]

Series is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Chen, Guchun  
Sent: Monday, August 10, 2020 13:23
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, 
Dennis ; Lazar, Lijo ; Zhou1, Tao 
; Clements, John 
Cc: Chen, Guchun 
Subject: [PATCH 2/2] drm/amdgpu: add debugfs node to toggle ras error cnt 
harvest

Before ras recovery is issued, user could operate this debugfs node to 
enable/disable the harvest of all RAS IPs' ras error count registers, which 
will help keep hardware's registers'
status instead of cleaning up them.

Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index e6978b8e2143..31df6bf2dc1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1215,6 +1215,13 @@ static void amdgpu_ras_debugfs_create_ctrl_node(struct 
amdgpu_device *adev)
 */
debugfs_create_bool("auto_reboot", S_IWUGO | S_IRUGO, con->dir,
>reboot);
+
+   /*
+* User could set this not to clean up hardware's error count register
+* of RAS IPs during ras recovery.
+*/
+   debugfs_create_bool("disable_ras_err_cnt_harvest", 0644,
+   con->dir, >disable_ras_err_cnt_harvest);
 }
 
 void amdgpu_ras_debugfs_create(struct amdgpu_device *adev,
--
2.17.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx