RE: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test
[AMD Public Use] Hi Gunchun, Thanks for pointing this out, for rap_test file node, I thought I need to use the same file mode as ras previously, it seems that I'm wrong, for this rap_test file node, I think only S_IWUSR is enough, what's your opinion? Brs Wenhui -Original Message- From: Chen, Guchun Sent: Tuesday, August 11, 2020 11:33 AM To: Sheng, Wenhui ; amd-gfx@lists.freedesktop.org Cc: Clements, John ; Sheng, Wenhui ; Zhang, Hawking Subject: RE: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test [AMD Public Use] One comment inline, please check. Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Wenhui Sheng Sent: Tuesday, August 11, 2020 11:04 AM To: amd-gfx@lists.freedesktop.org Cc: Clements, John ; Sheng, Wenhui ; Zhang, Hawking Subject: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test After amdgpu driver loading successfully, we can use RAP debugfs interface /dri/xxx/rap_test to trigger RAP test. Currently only L0 validate test is supported. v2: refine amdgpu_rap.h Signed-off-by: Wenhui Sheng --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 127 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h | 28 + 4 files changed, 159 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 0ba396e9d7e4..dec1927ca75d 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -55,7 +55,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \ amdgpu_gmc.o amdgpu_mmhub.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \ amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \ - amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o + amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o amdgpu-$(CONFIG_PERF_EVENTS) += amdgpu_pmu.o diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 3a4b31b1c4f2..0af249a1e35b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -34,6 +34,7 @@ #include "amdgpu_pm.h" #include "amdgpu_dm_debugfs.h" #include "amdgpu_ras.h" +#include "amdgpu_rap.h" /** * amdgpu_debugfs_add_files - Add simple debugfs entries @@ -1623,6 +1624,8 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev) amdgpu_debugfs_autodump_init(adev); + amdgpu_rap_debugfs_init(adev); + return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list, ARRAY_SIZE(amdgpu_debugfs_list)); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c new file mode 100644 index ..46b2607a7bd1 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c @@ -0,0 +1,127 @@ +/* + * Copyright 2020 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person +obtaining a + * copy of this software and associated documentation files (the +"Software"), + * to deal in the Software without restriction, including without +limitation + * the rights to use, copy, modify, merge, publish, distribute, +sublicense, + * and/or sell copies of the Software, and to permit persons to whom +the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be +included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT +SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, +DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE +OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * + */ +#include +#include + +#include "amdgpu.h" +#include "amdgpu_rap.h" + +/** + * DOC: AMDGPU RAP debugfs test interface + * + * how to use? + * echo opcode > /dri/xxx/rap_test + * + * opcode: + * currently, only 2 is supported by Linux host driver, + * opcode 2 stands for TA_CMD_RAP__VALIDATE_L0, used to + * trigger L0 policy validation, you can refer more detail + * from header file ta_rap_if.h + * + */ +static ssize_t amdgpu_rap_debugfs_write(struct file *f, const char __user *buf, + size_t size, loff_t *pos) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private; + struct
[PATCH] drm/amd/powerplay: minor cleanups
Drop unnecessary lock protections during hw setup which was confirmed to have no race condition. Drop also unnecessary null pointer checker. Change-Id: Ida301ae7bad1abae15285c4e019eda4f7dc6e297 Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c| 20 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 2 - .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h| 1 - drivers/gpu/drm/amd/powerplay/navi10_ppt.c| 2 - .../drm/amd/powerplay/sienna_cichlid_ppt.c| 2 - drivers/gpu/drm/amd/powerplay/smu_internal.h | 1 + drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 50 --- 7 files changed, 11 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c index 1ffacc712e53..c70f94377644 100644 --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c @@ -361,20 +361,16 @@ static int smu_get_driver_allowed_feature_mask(struct smu_context *smu) int ret = 0; uint32_t allowed_feature_mask[SMU_FEATURE_MAX/32]; - mutex_lock(>mutex); bitmap_zero(feature->allowed, SMU_FEATURE_MAX); - mutex_unlock(>mutex); ret = smu_get_allowed_feature_mask(smu, allowed_feature_mask, SMU_FEATURE_MAX/32); if (ret) return ret; - mutex_lock(>mutex); bitmap_or(feature->allowed, feature->allowed, (unsigned long *)allowed_feature_mask, feature->feature_num); - mutex_unlock(>mutex); return ret; } @@ -576,9 +572,6 @@ static int smu_fini_fb_allocations(struct smu_context *smu) struct smu_table *tables = smu_table->tables; struct smu_table *driver_table = &(smu_table->driver_table); - if (!tables) - return 0; - if (tables[SMU_TABLE_PMSTATUSLOG].mc_address) amdgpu_bo_free_kernel([SMU_TABLE_PMSTATUSLOG].bo, [SMU_TABLE_PMSTATUSLOG].mc_address, @@ -2250,19 +2243,6 @@ int smu_set_deep_sleep_dcefclk(struct smu_context *smu, int clk) return ret; } -int smu_set_active_display_count(struct smu_context *smu, uint32_t count) -{ - int ret = 0; - - if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled) - return -EOPNOTSUPP; - - if (smu->ppt_funcs->set_active_display_count) - ret = smu->ppt_funcs->set_active_display_count(smu, count); - - return ret; -} - int smu_get_clock_by_type(struct smu_context *smu, enum amd_pp_clock_type type, struct amd_pp_clocks *clocks) diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c index 8b1025dc54fd..25679d60f7b4 100644 --- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c @@ -386,11 +386,9 @@ static int arcturus_check_powerplay_table(struct smu_context *smu) table_context->power_play_table; struct smu_baco_context *smu_baco = >smu_baco; - mutex_lock(_baco->mutex); if (powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_BACO || powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_MACO) smu_baco->platform_support = true; - mutex_unlock(_baco->mutex); table_context->thermal_controller_type = powerplay_table->thermal_controller_type; diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h index 23c2279bd500..8de39b31e7c2 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h @@ -698,7 +698,6 @@ int smu_set_fan_speed_percent(struct smu_context *smu, uint32_t speed); int smu_get_fan_speed_rpm(struct smu_context *smu, uint32_t *speed); int smu_set_deep_sleep_dcefclk(struct smu_context *smu, int clk); -int smu_set_active_display_count(struct smu_context *smu, uint32_t count); int smu_get_clock_by_type(struct smu_context *smu, enum amd_pp_clock_type type, diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c index 42a60769c52f..61e2971be9f3 100644 --- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c @@ -346,11 +346,9 @@ static int navi10_check_powerplay_table(struct smu_context *smu) if (powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_HARDWAREDC) smu->dc_controlled_by_gpio = true; - mutex_lock(_baco->mutex); if (powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_BACO || powerplay_table->platform_caps & SMU_11_0_PP_PLATFORM_CAP_MACO) smu_baco->platform_support = true; - mutex_unlock(_baco->mutex);
[PATCH] drm/amd/powerplay: bump NAVI12 driver if version
To fit the latest SMU firmware. Change-Id: Ic9d02de54d20b6b90d18bac8b3fbb356d8fdf3ad Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h index ee1506beb0ea..65363d56e3cc 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h @@ -28,7 +28,7 @@ #define SMU11_DRIVER_IF_VERSION_INV 0x #define SMU11_DRIVER_IF_VERSION_ARCT 0x17 #define SMU11_DRIVER_IF_VERSION_NV10 0x36 -#define SMU11_DRIVER_IF_VERSION_NV12 0x33 +#define SMU11_DRIVER_IF_VERSION_NV12 0x36 #define SMU11_DRIVER_IF_VERSION_NV14 0x36 #define SMU11_DRIVER_IF_VERSION_Sienna_Cichlid 0x35 #define SMU11_DRIVER_IF_VERSION_Navy_Flounder 0x3 -- 2.28.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/2] drm/amd/powerplay: maximum the code sharing around metrics table retrieving
Instead of having one copy in each ASIC. Change-Id: I5e2a72382700cdb0e4847e5d9e2143f4b5508cdb Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 55 ++- drivers/gpu/drm/amd/powerplay/navi10_ppt.c| 41 +++--- drivers/gpu/drm/amd/powerplay/renoir_ppt.c| 40 ++ .../drm/amd/powerplay/sienna_cichlid_ppt.c| 55 ++- drivers/gpu/drm/amd/powerplay/smu_cmn.c | 45 +++ drivers/gpu/drm/amd/powerplay/smu_cmn.h | 8 +++ 6 files changed, 77 insertions(+), 167 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c index e59e6fb6f0a8..8b1025dc54fd 100644 --- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c @@ -541,49 +541,6 @@ static int arcturus_freqs_in_same_level(int32_t frequency1, return (abs(frequency1 - frequency2) <= EPSILON); } -static int arcturus_get_metrics_table_locked(struct smu_context *smu, -SmuMetrics_t *metrics_table, -bool bypass_cache) -{ - struct smu_table_context *smu_table= >smu_table; - int ret = 0; - - if (bypass_cache || - !smu_table->metrics_time || - time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(1))) { - ret = smu_cmn_update_table(smu, - SMU_TABLE_SMU_METRICS, - 0, - smu_table->metrics_table, - false); - if (ret) { - dev_info(smu->adev->dev, "Failed to export SMU metrics table!\n"); - return ret; - } - smu_table->metrics_time = jiffies; - } - - if (metrics_table) - memcpy(metrics_table, smu_table->metrics_table, sizeof(SmuMetrics_t)); - - return 0; -} - -static int arcturus_get_metrics_table(struct smu_context *smu, - SmuMetrics_t *metrics_table, - bool bypass_cache) -{ - int ret = 0; - - mutex_lock(>metrics_lock); - ret = arcturus_get_metrics_table_locked(smu, - metrics_table, - bypass_cache); - mutex_unlock(>metrics_lock); - - return ret; -} - static int arcturus_get_smu_metrics_data(struct smu_context *smu, MetricsMember_t member, uint32_t *value) @@ -594,9 +551,9 @@ static int arcturus_get_smu_metrics_data(struct smu_context *smu, mutex_lock(>metrics_lock); - ret = arcturus_get_metrics_table_locked(smu, - NULL, - false); + ret = smu_cmn_get_metrics_table_locked(smu, + NULL, + false); if (ret) { mutex_unlock(>metrics_lock); return ret; @@ -2305,9 +2262,9 @@ static ssize_t arcturus_get_gpu_metrics(struct smu_context *smu, SmuMetrics_t metrics; int ret = 0; - ret = arcturus_get_metrics_table(smu, -, -true); + ret = smu_cmn_get_metrics_table(smu, + , + true); if (ret) return ret; diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c index 82659b781f05..42a60769c52f 100644 --- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c @@ -504,34 +504,6 @@ static int navi10_tables_init(struct smu_context *smu) return -ENOMEM; } -static int navi10_get_metrics_table_locked(struct smu_context *smu, - SmuMetrics_t *metrics_table, - bool bypass_cache) -{ - struct smu_table_context *smu_table= >smu_table; - int ret = 0; - - if (bypass_cache || - !smu_table->metrics_time || - time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(1))) { - ret = smu_cmn_update_table(smu, - SMU_TABLE_SMU_METRICS, - 0, - smu_table->metrics_table, - false); - if (ret) { - dev_info(smu->adev->dev, "Failed to export SMU metrics table!\n"); - return ret; - } - smu_table->metrics_time =
[PATCH 1/2] drm/amd/powerplay: update the metrics table cache interval as 1ms
To make the setting same as Arcturus/Navi1x/Sienna_Cichlid. Change-Id: I7ea721bf5872023f1ab39c3827fb9c6fd05877cc Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 2 +- drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 2 +- drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c index c70c30175801..f0680dd58508 100644 --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c @@ -1272,7 +1272,7 @@ static int vega12_get_metrics_table(struct pp_hwmgr *hwmgr, if (bypass_cache || !data->metrics_time || - time_after(jiffies, data->metrics_time + HZ / 2)) { + time_after(jiffies, data->metrics_time + msecs_to_jiffies(1))) { ret = smum_smc_table_manager(hwmgr, (uint8_t *)(>metrics_table), TABLE_SMU_METRICS, diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c index c9f402edc0d6..da84012b7fd5 100644 --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c @@ -2082,7 +2082,7 @@ static int vega20_get_metrics_table(struct pp_hwmgr *hwmgr, if (bypass_cache || !data->metrics_time || - time_after(jiffies, data->metrics_time + HZ / 2)) { + time_after(jiffies, data->metrics_time + msecs_to_jiffies(1))) { ret = smum_smc_table_manager(hwmgr, (uint8_t *)(>metrics_table), TABLE_SMU_METRICS, diff --git a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c index 8a8e6033f71f..c50c4547fea9 100644 --- a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c @@ -139,7 +139,7 @@ static int renoir_get_metrics_table(struct smu_context *smu, if (bypass_cache || !smu_table->metrics_time || - time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(100))) { + time_after(jiffies, smu_table->metrics_time + msecs_to_jiffies(1))) { ret = smu_cmn_update_table(smu, SMU_TABLE_SMU_METRICS, 0, (void *)smu_table->metrics_table, false); if (ret) { -- 2.28.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test
[AMD Public Use] One comment inline, please check. Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Wenhui Sheng Sent: Tuesday, August 11, 2020 11:04 AM To: amd-gfx@lists.freedesktop.org Cc: Clements, John ; Sheng, Wenhui ; Zhang, Hawking Subject: [PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test After amdgpu driver loading successfully, we can use RAP debugfs interface /dri/xxx/rap_test to trigger RAP test. Currently only L0 validate test is supported. v2: refine amdgpu_rap.h Signed-off-by: Wenhui Sheng --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 127 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h | 28 + 4 files changed, 159 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 0ba396e9d7e4..dec1927ca75d 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -55,7 +55,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \ amdgpu_gmc.o amdgpu_mmhub.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \ amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \ - amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o + amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o amdgpu-$(CONFIG_PERF_EVENTS) += amdgpu_pmu.o diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 3a4b31b1c4f2..0af249a1e35b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -34,6 +34,7 @@ #include "amdgpu_pm.h" #include "amdgpu_dm_debugfs.h" #include "amdgpu_ras.h" +#include "amdgpu_rap.h" /** * amdgpu_debugfs_add_files - Add simple debugfs entries @@ -1623,6 +1624,8 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev) amdgpu_debugfs_autodump_init(adev); + amdgpu_rap_debugfs_init(adev); + return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list, ARRAY_SIZE(amdgpu_debugfs_list)); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c new file mode 100644 index ..46b2607a7bd1 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c @@ -0,0 +1,127 @@ +/* + * Copyright 2020 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person +obtaining a + * copy of this software and associated documentation files (the +"Software"), + * to deal in the Software without restriction, including without +limitation + * the rights to use, copy, modify, merge, publish, distribute, +sublicense, + * and/or sell copies of the Software, and to permit persons to whom +the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be +included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT +SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, +DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE +OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * + */ +#include +#include + +#include "amdgpu.h" +#include "amdgpu_rap.h" + +/** + * DOC: AMDGPU RAP debugfs test interface + * + * how to use? + * echo opcode > /dri/xxx/rap_test + * + * opcode: + * currently, only 2 is supported by Linux host driver, + * opcode 2 stands for TA_CMD_RAP__VALIDATE_L0, used to + * trigger L0 policy validation, you can refer more detail + * from header file ta_rap_if.h + * + */ +static ssize_t amdgpu_rap_debugfs_write(struct file *f, const char __user *buf, + size_t size, loff_t *pos) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private; + struct ta_rap_shared_memory *rap_shared_mem; + struct ta_rap_cmd_output_data *rap_cmd_output; + struct drm_device *dev = adev->ddev; + uint32_t op; + int ret; + + if (*pos || size != 2) + return -EINVAL; + + ret = kstrtouint_from_user(buf, size, *pos, ); + if (ret) + return ret; + + ret = pm_runtime_get_sync(dev->dev); + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); + return ret; + } + + /* make sure gfx core is
[PATCH 3/3] drm/amdgpu: add debugfs interface for RAP test
After amdgpu driver loading successfully, we can use RAP debugfs interface /dri/xxx/rap_test to trigger RAP test. Currently only L0 validate test is supported. v2: refine amdgpu_rap.h Signed-off-by: Wenhui Sheng --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 127 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h | 28 + 4 files changed, 159 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.h diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 0ba396e9d7e4..dec1927ca75d 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -55,7 +55,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ amdgpu_vf_error.o amdgpu_sched.o amdgpu_debugfs.o amdgpu_ids.o \ amdgpu_gmc.o amdgpu_mmhub.o amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \ amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \ - amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o + amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o amdgpu-$(CONFIG_PERF_EVENTS) += amdgpu_pmu.o diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 3a4b31b1c4f2..0af249a1e35b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -34,6 +34,7 @@ #include "amdgpu_pm.h" #include "amdgpu_dm_debugfs.h" #include "amdgpu_ras.h" +#include "amdgpu_rap.h" /** * amdgpu_debugfs_add_files - Add simple debugfs entries @@ -1623,6 +1624,8 @@ int amdgpu_debugfs_init(struct amdgpu_device *adev) amdgpu_debugfs_autodump_init(adev); + amdgpu_rap_debugfs_init(adev); + return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_list, ARRAY_SIZE(amdgpu_debugfs_list)); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c new file mode 100644 index ..46b2607a7bd1 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c @@ -0,0 +1,127 @@ +/* + * Copyright 2020 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * + */ +#include +#include + +#include "amdgpu.h" +#include "amdgpu_rap.h" + +/** + * DOC: AMDGPU RAP debugfs test interface + * + * how to use? + * echo opcode > /dri/xxx/rap_test + * + * opcode: + * currently, only 2 is supported by Linux host driver, + * opcode 2 stands for TA_CMD_RAP__VALIDATE_L0, used to + * trigger L0 policy validation, you can refer more detail + * from header file ta_rap_if.h + * + */ +static ssize_t amdgpu_rap_debugfs_write(struct file *f, const char __user *buf, + size_t size, loff_t *pos) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)file_inode(f)->i_private; + struct ta_rap_shared_memory *rap_shared_mem; + struct ta_rap_cmd_output_data *rap_cmd_output; + struct drm_device *dev = adev->ddev; + uint32_t op; + int ret; + + if (*pos || size != 2) + return -EINVAL; + + ret = kstrtouint_from_user(buf, size, *pos, ); + if (ret) + return ret; + + ret = pm_runtime_get_sync(dev->dev); + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); + return ret; + } + + /* make sure gfx core is on, RAP TA cann't handle +* GFX OFF case currently. +*/ + amdgpu_gfx_off_ctrl(adev, false); + + switch (op) { + case 2: + ret = psp_rap_invoke(>psp, op); + + if (ret == TA_RAP_STATUS__SUCCESS) { + dev_info(adev->dev, "RAP L0 validate test success.\n"); + } else { +
[PATCH 1/3] drm/amdgpu: add RAP TA header file
The RAP TA contains tests used to verify if RAP(Register Access Policy), or otherwise known as Security Policy is applied correctly by PSP BL The RAP test is a measure to ensure that we reduce the avenue of complexity and mistakes when dealing with RAP in post-si execution, where debugging failures related to RAP is quite difficult and expensive. v2: add introduction for RAP TA Signed-off-by: Wenhui Sheng --- drivers/gpu/drm/amd/amdgpu/ta_rap_if.h | 84 ++ 1 file changed, 84 insertions(+) create mode 100644 drivers/gpu/drm/amd/amdgpu/ta_rap_if.h diff --git a/drivers/gpu/drm/amd/amdgpu/ta_rap_if.h b/drivers/gpu/drm/amd/amdgpu/ta_rap_if.h new file mode 100644 index ..f14833fae07c --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/ta_rap_if.h @@ -0,0 +1,84 @@ +/* + * Copyright 2020 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + */ + +#ifndef _TA_RAP_IF_H +#define _TA_RAP_IF_H + +/* Responses have bit 31 set */ +#define RSP_ID_MASK (1U << 31) +#define RSP_ID(cmdId) (((uint32_t)(cmdId)) | RSP_ID_MASK) + +enum ta_rap_status { + TA_RAP_STATUS__SUCCESS = 1, + TA_RAP_STATUS__ERROR_GENERIC_FAILURE= 2, + TA_RAP_STATUS__ERROR_CMD_NOT_SUPPORTED = 3, + TA_RAP_STATUS__ERROR_INVALID_VALIDATION_METHOD = 4, + TA_RAP_STATUS__ERROR_NULL_POINTER = 5, + TA_RAP_STATUS__ERROR_NOT_INITIALIZED= 6, + TA_RAP_STATUS__ERROR_VALIDATION_FAILED = 7, + TA_RAP_STATUS__ERROR_ASIC_NOT_SUPPORTED = 8, + TA_RAP_STATUS__ERROR_OPERATION_NOT_PERMISSABLE = 9, + TA_RAP_STATUS__ERROR_ALREADY_INIT = 10, +}; + +enum ta_rap_cmd { + TA_CMD_RAP__INITIALIZE = 1, + TA_CMD_RAP__VALIDATE_L0 = 2, +}; + +enum ta_rap_validation_method { + METHOD_A = 1, +}; + +struct ta_rap_cmd_input_data { + uint8_t reserved[8]; +}; + +struct ta_rap_cmd_output_data { + uint32_tlast_subsection; + uint32_tnum_total_validate; + uint32_tnum_valid; + uint32_tlast_validate_addr; + uint32_tlast_validate_val; + uint32_tlast_validate_val_exptd; +}; + +union ta_rap_cmd_input { + struct ta_rap_cmd_input_data input; +}; + +union ta_rap_cmd_output { + struct ta_rap_cmd_output_data output; +}; + +struct ta_rap_shared_memory { + uint32_tcmd_id; + uint32_tvalidation_method_id; + uint32_tresp_id; + enum ta_rap_status rap_status; + union ta_rap_cmd_input rap_in_message; + union ta_rap_cmd_output rap_out_message; + uint8_t reserved[64]; +}; + +#endif // #define _TA_RAP_IF_H -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/3] drm/amdgpu: enable RAP TA load
Enable the RAP TA loading path and add RAP test trigger interface. v2: fix potential mem leak issue Signed-off-by: Wenhui Sheng --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 183 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 17 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 1 + 3 files changed, 201 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index c68369731b20..116a89990f39 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -1430,6 +1430,168 @@ static int psp_dtm_terminate(struct psp_context *psp) } // DTM end +// RAP start +static int psp_rap_init_shared_buf(struct psp_context *psp) +{ + int ret; + + /* +* Allocate 16k memory aligned to 4k from Frame Buffer (local +* physical) for rap ta <-> Driver +*/ + ret = amdgpu_bo_create_kernel(psp->adev, PSP_RAP_SHARED_MEM_SIZE, + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM, + >rap_context.rap_shared_bo, + >rap_context.rap_shared_mc_addr, + >rap_context.rap_shared_buf); + + return ret; +} + +static int psp_rap_load(struct psp_context *psp) +{ + int ret; + struct psp_gfx_cmd_resp *cmd; + + cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL); + if (!cmd) + return -ENOMEM; + + memset(psp->fw_pri_buf, 0, PSP_1_MEG); + memcpy(psp->fw_pri_buf, psp->ta_rap_start_addr, psp->ta_rap_ucode_size); + + psp_prep_ta_load_cmd_buf(cmd, +psp->fw_pri_mc_addr, +psp->ta_rap_ucode_size, +psp->rap_context.rap_shared_mc_addr, +PSP_RAP_SHARED_MEM_SIZE); + + ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr); + + if (!ret) { + psp->rap_context.rap_initialized = true; + psp->rap_context.session_id = cmd->resp.session_id; + mutex_init(>rap_context.mutex); + } + + kfree(cmd); + + return ret; +} + +static int psp_rap_unload(struct psp_context *psp) +{ + int ret; + struct psp_gfx_cmd_resp *cmd; + + cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL); + if (!cmd) + return -ENOMEM; + + psp_prep_ta_unload_cmd_buf(cmd, psp->rap_context.session_id); + + ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr); + + kfree(cmd); + + return ret; +} + +static int psp_rap_initialize(struct psp_context *psp) +{ + int ret; + + /* +* TODO: bypass the initialize in sriov for now +*/ + if (amdgpu_sriov_vf(psp->adev)) + return 0; + + if (!psp->adev->psp.ta_rap_ucode_size || + !psp->adev->psp.ta_rap_start_addr) { + dev_info(psp->adev->dev, "RAP: optional rap ta ucode is not available\n"); + return 0; + } + + if (!psp->rap_context.rap_initialized) { + ret = psp_rap_init_shared_buf(psp); + if (ret) + return ret; + } + + ret = psp_rap_load(psp); + if (ret) + return ret; + + ret = psp_rap_invoke(psp, TA_CMD_RAP__INITIALIZE); + if (ret != TA_RAP_STATUS__SUCCESS) { + psp_rap_unload(psp); + + amdgpu_bo_free_kernel(>rap_context.rap_shared_bo, + >rap_context.rap_shared_mc_addr, + >rap_context.rap_shared_buf); + + psp->rap_context.rap_initialized = false; + + dev_warn(psp->adev->dev, "RAP TA initialize fail.\n"); + return -EINVAL; + } + + return 0; +} + +static int psp_rap_terminate(struct psp_context *psp) +{ + int ret; + + if (!psp->rap_context.rap_initialized) + return 0; + + ret = psp_rap_unload(psp); + + psp->rap_context.rap_initialized = false; + + /* free rap shared memory */ + amdgpu_bo_free_kernel(>rap_context.rap_shared_bo, + >rap_context.rap_shared_mc_addr, + >rap_context.rap_shared_buf); + + return ret; +} + +int psp_rap_invoke(struct psp_context *psp, uint32_t ta_cmd_id) +{ + struct ta_rap_shared_memory *rap_cmd; + int ret; + + if (!psp->rap_context.rap_initialized) + return -EINVAL; + + if (ta_cmd_id != TA_CMD_RAP__INITIALIZE && + ta_cmd_id != TA_CMD_RAP__VALIDATE_L0) + return -EINVAL; + + mutex_lock(>rap_context.mutex); + + rap_cmd = (struct ta_rap_shared_memory *) + psp->rap_context.rap_shared_buf; + memset(rap_cmd, 0, sizeof(struct ta_rap_shared_memory)); + + rap_cmd->cmd_id = ta_cmd_id;
[PATCH v3] drm/amdgpu: annotate a false positive recursive locking
[ 584.110304] [ 584.110590] WARNING: possible recursive locking detected [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE [ 584.64] [ 584.111456] kworker/38:1/553 is trying to acquire lock: [ 584.111721] 9b15ff0a47a0 (>reset_sem){}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.112112] but task is already holding lock: [ 584.112673] 9b1603d247a0 (>reset_sem){}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.113068] other info that might help us debug this: [ 584.113689] Possible unsafe locking scenario: [ 584.114350]CPU0 [ 584.114685] [ 584.115014] lock(>reset_sem); [ 584.115349] lock(>reset_sem); [ 584.115678] *** DEADLOCK *** [ 584.116624] May be due to missing lock nesting notation [ 584.117284] 4 locks held by kworker/38:1/553: [ 584.117616] #0: 9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630 [ 584.117967] #1: ac708e1c3e58 ((work_completion)(>recovery_work)){+.+.}, at: process_one_work+0x21f/0x630 [ 584.118358] #2: c1c2a5d0 (>hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [ 584.118786] #3: 9b1603d247a0 (>reset_sem){}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.119222] stack backtrace: [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 584.121638] Call Trace: [ 584.122050] dump_stack+0x98/0xd5 [ 584.122499] __lock_acquire+0x1139/0x16e0 [ 584.122931] ? trace_hardirqs_on+0x3b/0xf0 [ 584.123358] ? cancel_delayed_work+0xa6/0xc0 [ 584.123771] lock_acquire+0xb8/0x1c0 [ 584.124197] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599] down_write+0x49/0x120 [ 584.125032] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125472] amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125910] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [ 584.126367] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [ 584.126789] process_one_work+0x29e/0x630 [ 584.127208] worker_thread+0x3c/0x3f0 [ 584.127621] ? __kthread_parkme+0x61/0x90 [ 584.128014] kthread+0x12f/0x150 [ 584.128402] ? process_one_work+0x630/0x630 [ 584.128790] ? kthread_park+0x90/0x90 [ 584.129174] ret_from_fork+0x3a/0x50 Each adev has owned lock_class_key to avoid false positive recursive locking. v2: 1. register adev->lock_key into lockdep, otherwise lockdep will report the below warning [ 1216.705820] BUG: key 890183b647d0 has not been registered! [ 1216.705924] [ cut here ] [ 1216.705972] DEBUG_LOCKS_WARN_ON(1) [ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210 v3: change to use down_write_nest_lock to annotate the false dead-lock warning. Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 62ecac97fbd2..8a55b0bc044a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4145,12 +4145,15 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, return r; } -static bool amdgpu_device_lock_adev(struct amdgpu_device *adev) +static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, struct amdgpu_hive_info *hive) { if (atomic_cmpxchg(>in_gpu_reset, 0, 1) != 0) return false; - down_write(>reset_sem); + if (hive) { + down_write_nest_lock(>reset_sem, >hive_lock); + } else + down_write(>reset_sem); atomic_inc(>gpu_reset_counter); switch (amdgpu_asic_reset_method(adev)) { @@ -4312,7 +4315,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, /* block all schedulers and reset given job's ring */ list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) { - if (!amdgpu_device_lock_adev(tmp_adev)) { + if (!amdgpu_device_lock_adev(tmp_adev, hive)) { DRM_INFO("Bailing on TDR for s_job:%llx, as another already in progress", job ? job->base.id : -1); r = 0; -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory
Applied and updated the commit message to reflect the sizes. Thanks! Alex On Mon, Aug 10, 2020 at 3:07 PM Marion & Christophe JAILLET wrote: > > > Le 10/08/2020 à 17:42, Dan Carpenter a écrit : > > On Sun, Aug 09, 2020 at 10:34:06PM +0200, Christophe JAILLET wrote: > >> When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead > >> of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than > >> 'sgt' (i.e struct sg_table), so this could lead to memory corruption. > > The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but > > it won't lead to corruption. > > > > 11 struct scatterlist { > > 12 unsigned long page_link; > > 13 unsigned intoffset; > > 14 unsigned intlength; > > 15 dma_addr_t dma_address; > > 16 #ifdef CONFIG_NEED_SG_DMA_LENGTH > > 17 unsigned intdma_length; > > 18 #endif > > 19 }; > > > > 42 struct sg_table { > > 43 struct scatterlist *sgl;/* the list */ > > 44 unsigned int nents; /* number of mapped > > entries */ > > 45 unsigned int orig_nents;/* original size of list */ > > 46 }; > > > > regards, > > dan carpenter > > > My bad. I read 'struct scatterlist sgl' (without the *) > Thanks for the follow-up, Dan. > > Doesn't smatch catch such mismatch? > (I've not run smatch for a while, so it is maybe reported) > > Well, the proposal is still valid, even if it has less impact as > initially thought. > > Thx for the review. > > CJ > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: convert to use le16_add_cpu()
Applied. Thanks! Alex On Mon, Aug 10, 2020 at 9:05 AM Qinglang Miao wrote: > > Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu(). > > Signed-off-by: Qinglang Miao > --- > drivers/gpu/drm/amd/display/dc/bios/command_table.c | 4 +--- > drivers/gpu/drm/amd/display/dc/bios/command_table2.c | 5 + > 2 files changed, 2 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c > b/drivers/gpu/drm/amd/display/dc/bios/command_table.c > index 5815983ca..070459e3e 100644 > --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c > +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c > @@ -1877,9 +1877,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3( > * but it is 4 either from Edid data (spec CEA 861) > * or CEA timing table. > */ > - params.usV_SyncOffset = > - > cpu_to_le16(le16_to_cpu(params.usV_SyncOffset) + 1); > - > + le16_add_cpu(_SyncOffset, 1); > } > } > > diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c > b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c > index bed91572f..e8f52eb8e 100644 > --- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c > +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c > @@ -569,10 +569,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3( > * but it is 4 either from Edid data (spec CEA 861) > * or CEA timing table. > */ > - params.v_syncoffset = > - cpu_to_le16(le16_to_cpu(params.v_syncoffset) + > - 1); > - > + le16_add_cpu(_syncoffset, 1); > } > } > > -- > 2.25.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: Non-deterministically boot into dark screen with `amdgpu`
On Mon, Aug 10, 2020 at 7:46 AM Christian König wrote: > > Hi guys, > > Am 10.08.20 um 08:43 schrieb Alexander Monakov: > > Hi, > > you should Сс a specialized mailing list and a relevant maintainer, > otherwise your email is likely to be ignored as LKML is an incredibly > high-volume list. Adding amd-gfx and Alex Deucher. > > > Thanks for forwarding this. AFAIK we haven't heard of this bug before, but > Alex already might know more about it. > > More thoughts below. > > On Sun, 9 Aug 2020, Ignat Insarov wrote: > > Hello! > > This is an issue report. I am not familiar with the Linux kernel > development procedure, so please direct me to a more appropriate or > specialized medium if this is not the right avenue. > > My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than > not. Screen blackness correlates with a line in the `systemd` journal > that says `RAM width Nbits DDR4`, where N is either 128 (resulting in > dark screen) or 64 (resulting in a healthy boot). The number seems to > be chosen at random with bias towards 128. This has been going on for > a while so here is some statistics: > > * 356 boots proceed far enough to attempt mode setting. > * 82 boots set RAM width to 64 bits and presumably succeed. > * 274 boots set RAM width to 128 bits and presumably fail. > > The issue is prevented with the `nomodeset` kernel option. > > I reported this previously (about a year ago) on the forum of my Linux > distribution.[1] The issue still persists as of linux 5.8.0. > > The details of my graphics controller, as well as some journal > excerpts, can be seen at [1]. One thing that has changed since then is > that on failure, there now appears a null pointer dereference error. I > am attaching the log of kernel messages from the most recent failed > boot — please request more information if needed. > > I appreciate any directions and advice as to how I may go about fixing > this annoyance. > > [1]: https://bbs.archlinux.org/viewtopic.php?id=248273 > > On the forum you show that in the "success" case there's one less "BIOS > signature incorrect" message. This implies that amdgpu_get_bios() in > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > gets the video BIOS from a different source. If that happens every time > (one "signature incorrect" message for "success", two for "failure") > that may be relevant to the problem you're experiencing. > > If you don't mind patching and rebuilding the kernel I suggest adding > debug printks to the aforementioned function to see exactly which methods > fail with wrong signature and which succeeds. > > Also might be worthwhile to check if there's a BIOS update for your laptop. > > > It might also be a good idea to try the latest amd-staging-drm-next branch > from Alex repository (bear with me I don't have the link at hand, but it > should be easy to find). > > Opening a bug report or searching the existing ones for something similar > under https://gitlab.freedesktop.org/drm/amd/-/issues might be a good idea as > well. > > And I completely agree that this sounds like an issue getting the BIOS image. I've not heard of an issue like this either. Best to file a gitlab bug and attach your full dmesg output in both the working and non-working cases and we can go from there. Alex > > Thanks, > Christian. > > > Alexander > > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 4.9 04/17] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 432ad7d73cb9b..99e23800cadc7 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -639,8 +639,10 @@ radeon_crtc_set_config(struct drm_mode_set *set) dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 30bd4a6a9d466..7648fd0d10751 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -496,8 +496,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index 4388ddeec8d24..96d2a564d9a3c 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -634,8 +634,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 4.4 04/16] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 4572bfba017c5..17c73b8c90e71 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -660,8 +660,10 @@ radeon_crtc_set_config(struct drm_mode_set *set) dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 5b6a6f5b3619e..401403a3ea50c 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -527,8 +527,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index d290a8a09036e..41caf7da90548 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -631,8 +631,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 4.14 04/22] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index d86110cdf0852..b2334349799d1 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -627,8 +627,10 @@ radeon_crtc_set_config(struct drm_mode_set *set, dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set, ctx); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index f6908e2f9e55a..41e8abd099784 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -496,8 +496,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index dfee8f7d94ae5..2e28cf8118404 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -659,8 +659,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 4.19 16/31] drm/radeon: disable AGP by default
From: Christian König [ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ] Always use the PCI GART instead. We just have to many cases where AGP still causes problems. This means a performance regression for some GPUs, but also a bug fix for some others. Signed-off-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_drv.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 54729acd0d4af..0cd33289c2b63 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -168,12 +168,7 @@ int radeon_no_wb; int radeon_modeset = -1; int radeon_dynclks = -1; int radeon_r4xx_atom = 0; -#ifdef __powerpc__ -/* Default to PCI on PowerPC (fdo #95017) */ int radeon_agpmode = -1; -#else -int radeon_agpmode = 0; -#endif int radeon_vram_limit = 0; int radeon_gart_size = -1; /* auto */ int radeon_benchmarking = 0; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 4.19 06/31] drm/amdgpu: avoid dereferencing a NULL pointer
From: Jack Xiao [ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ] Check if irq_src is NULL to avoid dereferencing a NULL pointer, for MES ring is uneccessary to recieve an interrupt notification. Signed-off-by: Jack Xiao Acked-by: Alex Deucher Reviewed-by: Hawking Zhang Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 869ff624b108c..e5e51e4d4f3d8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -396,7 +396,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + index; } amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq)); - amdgpu_irq_get(adev, irq_src, irq_type); + + if (irq_src) + amdgpu_irq_get(adev, irq_src, irq_type); ring->fence_drv.irq_src = irq_src; ring->fence_drv.irq_type = irq_type; @@ -508,8 +510,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev) /* no need to trigger GPU reset as we are unloading */ amdgpu_fence_driver_force_completion(ring); } - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); drm_sched_fini(>sched); del_timer_sync(>fence_drv.fallback_timer); for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j) @@ -545,8 +548,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev) } /* disable the interrupt */ - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } @@ -572,8 +576,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev) continue; /* enable the interrupt */ - amdgpu_irq_get(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_get(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.4 28/45] drm/amd/powerplay: fix compile error with ARCH=arc
From: Evan Quan [ Upstream commit 9822ba2ead1baa3de4860ad9472f652c4cc78c9c ] Fix the compile error below: drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 'smu_v11_0_init_microcode': >> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function >> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration] 22 | pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ | ^~~ drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:176:3: note: in expansion of macro 'BUG' 176 | BUG(); Reported-by: kernel test robot Signed-off-by: Evan Quan Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index 0922d9cd858a0..c4d8c52c6b9ca 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -171,7 +171,8 @@ static int smu_v11_0_init_microcode(struct smu_context *smu) chip_name = "navi12"; break; default: - BUG(); + dev_err(adev->dev, "Unsupported ASIC type %d\n", adev->asic_type); + return -EINVAL; } snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 4.19 07/31] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 7d1e14f0140a2..3f0f3a578ddf0 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -625,8 +625,10 @@ radeon_crtc_set_config(struct drm_mode_set *set, dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set, ctx); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index c26f09b47ecb2..54729acd0d4af 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -523,8 +523,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index 3ff835767ac58..34b3cb6c146f9 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -627,8 +627,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.4 27/45] drm/amdgpu/display bail early in dm_pp_get_static_clocks
From: Alex Deucher [ Upstream commit 376814f5fcf1aadda501d1413d56e8af85d19a97 ] If there are no supported callbacks. We'll fall back to the nominal clocks. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170 Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c index 785322cd4c6c9..7241d4c207789 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c @@ -530,6 +530,8 @@ bool dm_pp_get_static_clocks( _clk_info); else if (adev->smu.funcs) ret = smu_get_current_clocks(>smu, _clk_info); + else + return false; if (ret) return false; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.4 24/45] drm/radeon: disable AGP by default
From: Christian König [ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ] Always use the PCI GART instead. We just have to many cases where AGP still causes problems. This means a performance regression for some GPUs, but also a bug fix for some others. Signed-off-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_drv.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 7d417b9a52501..c2573096d43c0 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -174,12 +174,7 @@ int radeon_no_wb; int radeon_modeset = -1; int radeon_dynclks = -1; int radeon_r4xx_atom = 0; -#ifdef __powerpc__ -/* Default to PCI on PowerPC (fdo #95017) */ int radeon_agpmode = -1; -#else -int radeon_agpmode = 0; -#endif int radeon_vram_limit = 0; int radeon_gart_size = -1; /* auto */ int radeon_benchmarking = 0; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.4 07/45] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 0826efd9b5f51..f9f74150d0d73 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -631,8 +631,10 @@ radeon_crtc_set_config(struct drm_mode_set *set, dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set, ctx); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 6128792ab8836..7d417b9a52501 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -555,8 +555,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index 2bb0187c5bc78..709c4ef5e7d59 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 44/60] drm/amd/powerplay: suppress compile error around BUG_ON
From: Evan Quan [ Upstream commit 75bc07e2403caea9ecac69f766dfb7dc33547594 ] To suppress the compile error below for "ARCH=arc". drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c: In function 'arcturus_fill_eeprom_i2c_req': >> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function >> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration] 22 | pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ | ^~~ include/asm-generic/bug.h:62:57: note: in expansion of macro 'BUG' 62 | #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0) | ^~~ drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c:2157:2: note: in expansion of macro 'BUG_ON' 2157 | BUG_ON(numbytes > MAX_SW_I2C_COMMANDS); Signed-off-by: Evan Quan Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c index 1ef0923f71906..9ad0e6f18be49 100644 --- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c @@ -2035,8 +2035,6 @@ static void arcturus_fill_eeprom_i2c_req(SwI2cRequest_t *req, bool write, { int i; - BUG_ON(numbytes > MAX_SW_I2C_COMMANDS); - req->I2CcontrollerPort = 0; req->I2CSpeed = 2; req->SlaveAddress = address; @@ -2074,6 +2072,12 @@ static int arcturus_i2c_eeprom_read_data(struct i2c_adapter *control, struct smu_table_context *smu_table = >smu.smu_table; struct smu_table *table = _table->driver_table; + if (numbytes > MAX_SW_I2C_COMMANDS) { + dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n", + numbytes, MAX_SW_I2C_COMMANDS); + return -EINVAL; + } + memset(, 0, sizeof(req)); arcturus_fill_eeprom_i2c_req(, false, address, numbytes, data); @@ -2110,6 +2114,12 @@ static int arcturus_i2c_eeprom_write_data(struct i2c_adapter *control, SwI2cRequest_t req; struct amdgpu_device *adev = to_amdgpu_device(control); + if (numbytes > MAX_SW_I2C_COMMANDS) { + dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n", + numbytes, MAX_SW_I2C_COMMANDS); + return -EINVAL; + } + memset(, 0, sizeof(req)); arcturus_fill_eeprom_i2c_req(, true, address, numbytes, data); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.4 06/45] drm/amdgpu: avoid dereferencing a NULL pointer
From: Jack Xiao [ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ] Check if irq_src is NULL to avoid dereferencing a NULL pointer, for MES ring is uneccessary to recieve an interrupt notification. Signed-off-by: Jack Xiao Acked-by: Alex Deucher Reviewed-by: Hawking Zhang Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 23085b352cf2d..c212d5fc665c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -404,7 +404,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + index; } amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq)); - amdgpu_irq_get(adev, irq_src, irq_type); + + if (irq_src) + amdgpu_irq_get(adev, irq_src, irq_type); ring->fence_drv.irq_src = irq_src; ring->fence_drv.irq_type = irq_type; @@ -539,8 +541,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev) /* no need to trigger GPU reset as we are unloading */ amdgpu_fence_driver_force_completion(ring); } - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); drm_sched_fini(>sched); del_timer_sync(>fence_drv.fallback_timer); for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j) @@ -576,8 +579,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev) } /* disable the interrupt */ - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } @@ -603,8 +607,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev) continue; /* enable the interrupt */ - amdgpu_irq_get(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_get(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 35/60] drm/amd/powerplay: fix compile error with ARCH=arc
From: Evan Quan [ Upstream commit 9822ba2ead1baa3de4860ad9472f652c4cc78c9c ] Fix the compile error below: drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 'smu_v11_0_init_microcode': >> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function >> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration] 22 | pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ | ^~~ drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:176:3: note: in expansion of macro 'BUG' 176 | BUG(); Reported-by: kernel test robot Signed-off-by: Evan Quan Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index 655ba4fb05dcd..48af305d42d54 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -159,7 +159,8 @@ int smu_v11_0_init_microcode(struct smu_context *smu) chip_name = "navi12"; break; default: - BUG(); + dev_err(adev->dev, "Unsupported ASIC type %d\n", adev->asic_type); + return -EINVAL; } snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 28/60] drm/radeon: disable AGP by default
From: Christian König [ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ] Always use the PCI GART instead. We just have to many cases where AGP still causes problems. This means a performance regression for some GPUs, but also a bug fix for some others. Signed-off-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_drv.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 0cf0b00a0623e..6f0d1971099b7 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -171,12 +171,7 @@ int radeon_no_wb; int radeon_modeset = -1; int radeon_dynclks = -1; int radeon_r4xx_atom = 0; -#ifdef __powerpc__ -/* Default to PCI on PowerPC (fdo #95017) */ int radeon_agpmode = -1; -#else -int radeon_agpmode = 0; -#endif int radeon_vram_limit = 0; int radeon_gart_size = -1; /* auto */ int radeon_benchmarking = 0; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 09/60] drm/amdgpu: avoid dereferencing a NULL pointer
From: Jack Xiao [ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ] Check if irq_src is NULL to avoid dereferencing a NULL pointer, for MES ring is uneccessary to recieve an interrupt notification. Signed-off-by: Jack Xiao Acked-by: Alex Deucher Reviewed-by: Hawking Zhang Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 7531527067dfb..892c1e9a1eb04 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -408,7 +408,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + index; } amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq)); - amdgpu_irq_get(adev, irq_src, irq_type); + + if (irq_src) + amdgpu_irq_get(adev, irq_src, irq_type); ring->fence_drv.irq_src = irq_src; ring->fence_drv.irq_type = irq_type; @@ -529,8 +531,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev) /* no need to trigger GPU reset as we are unloading */ amdgpu_fence_driver_force_completion(ring); } - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); drm_sched_fini(>sched); del_timer_sync(>fence_drv.fallback_timer); for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j) @@ -566,8 +569,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev) } /* disable the interrupt */ - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } @@ -593,8 +597,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev) continue; /* enable the interrupt */ - amdgpu_irq_get(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_get(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 34/60] drm/amdgpu/display: properly guard the calls to swSMU functions
From: Alex Deucher [ Upstream commit 4072327a2622af8688b88f5cd0a472136d3bf33d ] It's only applicable on newer asics. We could end up here when using DC on older asics like SI or KV. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170 Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c index 7cee8070cb113..5c6a6ae48d396 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c @@ -106,7 +106,7 @@ bool dm_pp_apply_display_requirements( adev->powerplay.pp_funcs->display_configuration_change( adev->powerplay.pp_handle, >pm.pm_display_cfg); - else + else if (adev->smu.ppt_funcs) smu_display_configuration_change(smu, >pm.pm_display_cfg); @@ -592,7 +592,7 @@ void pp_rv_set_wm_ranges(struct pp_smu *pp, if (pp_funcs && pp_funcs->set_watermarks_for_clocks_ranges) pp_funcs->set_watermarks_for_clocks_ranges(pp_handle, _with_clock_ranges); - else + else if (adev->smu.ppt_funcs) smu_set_watermarks_for_clock_ranges(>smu, _with_clock_ranges); } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 32/60] drm/amd/display: Improve DisplayPort monitor interop
From: Aric Cyr [ Upstream commit eec3303de3378cdfaa0bb86f43546dbbd88f94e2 ] [Why] DC is very fast at link training and stream enablement which causes issues such as blackscreens for non-compliant monitors. [How] After debugging with scaler vendors we implement the minimum delays at the necessary locations to ensure the monitor does not hang. Delays are generic due to lack of IEEE OUI information on the failing displays. Signed-off-by: Aric Cyr Reviewed-by: Wenjing Liu Acked-by: Qingqing Zhuo Acked-by: Tony Cheng Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/dc/core/dc_link.c| 4 +++- drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 16 ++-- .../amd/display/dc/dce110/dce110_hw_sequencer.c | 11 ++- 3 files changed, 23 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index 67cfff1586e9f..3f157bcc174b9 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3146,9 +3146,11 @@ void core_link_disable_stream(struct pipe_ctx *pipe_ctx) write_i2c_redriver_setting(pipe_ctx, false); } } - dc->hwss.disable_stream(pipe_ctx); disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal); + + dc->hwss.disable_stream(pipe_ctx); + if (pipe_ctx->stream->timing.flags.DSC) { if (dc_is_dp_signal(pipe_ctx->stream->signal)) dp_set_dsc_enable(pipe_ctx, false); diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index caa090d0b6acc..1ada01322cd2c 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -1103,6 +1103,10 @@ static inline enum link_training_result perform_link_training_int( dpcd_pattern.v1_4.TRAINING_PATTERN_SET = DPCD_TRAINING_PATTERN_VIDEOIDLE; dpcd_set_training_pattern(link, dpcd_pattern); + /* delay 5ms after notifying sink of idle pattern before switching output */ + if (link->connector_signal != SIGNAL_TYPE_EDP) + msleep(5); + /* 4. mainlink output idle pattern*/ dp_set_hw_test_pattern(link, DP_TEST_PATTERN_VIDEO_MODE, NULL, 0); @@ -1552,6 +1556,12 @@ bool perform_link_training_with_retries( struct dc_link *link = stream->link; enum dp_panel_mode panel_mode = dp_get_panel_mode(link); + /* We need to do this before the link training to ensure the idle pattern in SST +* mode will be sent right after the link training +*/ + link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc, + pipe_ctx->stream_res.stream_enc->id, true); + for (j = 0; j < attempts; ++j) { dp_enable_link_phy( @@ -1568,12 +1578,6 @@ bool perform_link_training_with_retries( dp_set_panel_mode(link, panel_mode); - /* We need to do this before the link training to ensure the idle pattern in SST -* mode will be sent right after the link training -*/ - link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc, - pipe_ctx->stream_res.stream_enc->id, true); - if (link->aux_access_disabled) { dc_link_dp_perform_link_training_skip_aux(link, link_setting); return true; diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c index 10527593868cc..24ca592c90df5 100644 --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c @@ -1090,8 +1090,17 @@ void dce110_blank_stream(struct pipe_ctx *pipe_ctx) dc_link_set_abm_disable(link); } - if (dc_is_dp_signal(pipe_ctx->stream->signal)) + if (dc_is_dp_signal(pipe_ctx->stream->signal)) { pipe_ctx->stream_res.stream_enc->funcs->dp_blank(pipe_ctx->stream_res.stream_enc); + + /* +* After output is idle pattern some sinks need time to recognize the stream +* has changed or they enter protection state and hang. +*/ + if (!dc_is_embedded_signal(pipe_ctx->stream->signal)) + msleep(60); + } + } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 33/60] drm/amdgpu/display bail early in dm_pp_get_static_clocks
From: Alex Deucher [ Upstream commit 376814f5fcf1aadda501d1413d56e8af85d19a97 ] If there are no supported callbacks. We'll fall back to the nominal clocks. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170 Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c index a2e1a73f66b81..7cee8070cb113 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c @@ -530,6 +530,8 @@ bool dm_pp_get_static_clocks( _clk_info); else if (adev->smu.ppt_funcs) ret = smu_get_current_clocks(>smu, _clk_info); + else + return false; if (ret) return false; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.7 10/60] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 35db79a168bf7..df1a7eb736517 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -635,8 +635,10 @@ radeon_crtc_set_config(struct drm_mode_set *set, dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set, ctx); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 59f8186a24151..0cf0b00a0623e 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -549,8 +549,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index 58176db85952c..779e4cd86245d 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 37/64] drm/amdgpu/display: properly guard the calls to swSMU functions
From: Alex Deucher [ Upstream commit 4072327a2622af8688b88f5cd0a472136d3bf33d ] It's only applicable on newer asics. We could end up here when using DC on older asics like SI or KV. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170 Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c index 7cee8070cb113..5c6a6ae48d396 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c @@ -106,7 +106,7 @@ bool dm_pp_apply_display_requirements( adev->powerplay.pp_funcs->display_configuration_change( adev->powerplay.pp_handle, >pm.pm_display_cfg); - else + else if (adev->smu.ppt_funcs) smu_display_configuration_change(smu, >pm.pm_display_cfg); @@ -592,7 +592,7 @@ void pp_rv_set_wm_ranges(struct pp_smu *pp, if (pp_funcs && pp_funcs->set_watermarks_for_clocks_ranges) pp_funcs->set_watermarks_for_clocks_ranges(pp_handle, _with_clock_ranges); - else + else if (adev->smu.ppt_funcs) smu_set_watermarks_for_clock_ranges(>smu, _with_clock_ranges); } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 36/64] drm/amdgpu/display bail early in dm_pp_get_static_clocks
From: Alex Deucher [ Upstream commit 376814f5fcf1aadda501d1413d56e8af85d19a97 ] If there are no supported callbacks. We'll fall back to the nominal clocks. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1170 Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c index a2e1a73f66b81..7cee8070cb113 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c @@ -530,6 +530,8 @@ bool dm_pp_get_static_clocks( _clk_info); else if (adev->smu.ppt_funcs) ret = smu_get_current_clocks(>smu, _clk_info); + else + return false; if (ret) return false; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 34/64] drm/amd/display: Improve DisplayPort monitor interop
From: Aric Cyr [ Upstream commit eec3303de3378cdfaa0bb86f43546dbbd88f94e2 ] [Why] DC is very fast at link training and stream enablement which causes issues such as blackscreens for non-compliant monitors. [How] After debugging with scaler vendors we implement the minimum delays at the necessary locations to ensure the monitor does not hang. Delays are generic due to lack of IEEE OUI information on the failing displays. Signed-off-by: Aric Cyr Reviewed-by: Wenjing Liu Acked-by: Qingqing Zhuo Acked-by: Tony Cheng Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/dc/core/dc_link.c| 4 +++- drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 16 ++-- .../amd/display/dc/dce110/dce110_hw_sequencer.c | 11 ++- 3 files changed, 23 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index 48ab51533d5d6..841cc051b7d01 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3298,9 +3298,11 @@ void core_link_disable_stream(struct pipe_ctx *pipe_ctx) write_i2c_redriver_setting(pipe_ctx, false); } } - dc->hwss.disable_stream(pipe_ctx); disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal); + + dc->hwss.disable_stream(pipe_ctx); + if (pipe_ctx->stream->timing.flags.DSC) { if (dc_is_dp_signal(pipe_ctx->stream->signal)) dp_set_dsc_enable(pipe_ctx, false); diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index 91cd884d6f257..6124af571bff6 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -1102,6 +1102,10 @@ static inline enum link_training_result perform_link_training_int( dpcd_pattern.v1_4.TRAINING_PATTERN_SET = DPCD_TRAINING_PATTERN_VIDEOIDLE; dpcd_set_training_pattern(link, dpcd_pattern); + /* delay 5ms after notifying sink of idle pattern before switching output */ + if (link->connector_signal != SIGNAL_TYPE_EDP) + msleep(5); + /* 4. mainlink output idle pattern*/ dp_set_hw_test_pattern(link, DP_TEST_PATTERN_VIDEO_MODE, NULL, 0); @@ -1551,6 +1555,12 @@ bool perform_link_training_with_retries( struct dc_link *link = stream->link; enum dp_panel_mode panel_mode = dp_get_panel_mode(link); + /* We need to do this before the link training to ensure the idle pattern in SST +* mode will be sent right after the link training +*/ + link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc, + pipe_ctx->stream_res.stream_enc->id, true); + for (j = 0; j < attempts; ++j) { dp_enable_link_phy( @@ -1567,12 +1577,6 @@ bool perform_link_training_with_retries( dp_set_panel_mode(link, panel_mode); - /* We need to do this before the link training to ensure the idle pattern in SST -* mode will be sent right after the link training -*/ - link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc, - pipe_ctx->stream_res.stream_enc->id, true); - if (link->aux_access_disabled) { dc_link_dp_perform_link_training_skip_aux(link, link_setting); return true; diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c index b77e9dc160863..2af1d74d16ad8 100644 --- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c @@ -1069,8 +1069,17 @@ void dce110_blank_stream(struct pipe_ctx *pipe_ctx) link->dc->hwss.set_abm_immediate_disable(pipe_ctx); } - if (dc_is_dp_signal(pipe_ctx->stream->signal)) + if (dc_is_dp_signal(pipe_ctx->stream->signal)) { pipe_ctx->stream_res.stream_enc->funcs->dp_blank(pipe_ctx->stream_res.stream_enc); + + /* +* After output is idle pattern some sinks need time to recognize the stream +* has changed or they enter protection state and hang. +*/ + if (!dc_is_embedded_signal(pipe_ctx->stream->signal)) + msleep(60); + } + } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 38/64] drm/amd/display: allow query ddc data over aux to be read only operation
From: Wenjing Liu [ Upstream commit 26b4750d6cf84cb2b3f0a84c9b345e7b71886410 ] [why] Two issues: 1. Add read only operation support for query ddc data over aux. 2. Fix a bug where if read size is multiple of 16, mot of the last read transaction will not be set to 0. Signed-off-by: Wenjing Liu Reviewed-by: Jun Lei Acked-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- .../gpu/drm/amd/display/dc/core/dc_link_ddc.c | 29 --- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c index aefd29a440b52..be8f265976b09 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c @@ -503,7 +503,7 @@ bool dal_ddc_service_query_ddc_data( uint8_t *read_buf, uint32_t read_size) { - bool ret = false; + bool success = true; uint32_t payload_size = dal_ddc_service_is_in_aux_transaction_mode(ddc) ? DEFAULT_AUX_MAX_DATA_SIZE : EDID_SEGMENT_SIZE; @@ -527,7 +527,6 @@ bool dal_ddc_service_query_ddc_data( * but we want to read 256 over i2c*/ if (dal_ddc_service_is_in_aux_transaction_mode(ddc)) { struct aux_payload payload; - bool read_available = true; payload.i2c_over_aux = true; payload.address = address; @@ -536,21 +535,26 @@ bool dal_ddc_service_query_ddc_data( if (write_size != 0) { payload.write = true; - payload.mot = false; + /* should not set mot (middle of transaction) to 0 +* if there are pending read payloads +*/ + payload.mot = read_size == 0 ? false : true; payload.length = write_size; payload.data = write_buf; - ret = dal_ddc_submit_aux_command(ddc, ); - read_available = ret; + success = dal_ddc_submit_aux_command(ddc, ); } - if (read_size != 0 && read_available) { + if (read_size != 0 && success) { payload.write = false; + /* should set mot (middle of transaction) to 0 +* since it is the last payload to send +*/ payload.mot = false; payload.length = read_size; payload.data = read_buf; - ret = dal_ddc_submit_aux_command(ddc, ); + success = dal_ddc_submit_aux_command(ddc, ); } } else { struct i2c_command command = {0}; @@ -573,7 +577,7 @@ bool dal_ddc_service_query_ddc_data( command.number_of_payloads = dal_ddc_i2c_payloads_get_count(); - ret = dm_helpers_submit_i2c( + success = dm_helpers_submit_i2c( ddc->ctx, ddc->link, ); @@ -581,7 +585,7 @@ bool dal_ddc_service_query_ddc_data( dal_ddc_i2c_payloads_destroy(); } - return ret; + return success; } bool dal_ddc_submit_aux_command(struct ddc_service *ddc, @@ -598,7 +602,7 @@ bool dal_ddc_submit_aux_command(struct ddc_service *ddc, do { struct aux_payload current_payload; - bool is_end_of_payload = (retrieved + DEFAULT_AUX_MAX_DATA_SIZE) > + bool is_end_of_payload = (retrieved + DEFAULT_AUX_MAX_DATA_SIZE) >= payload->length; current_payload.address = payload->address; @@ -607,7 +611,10 @@ bool dal_ddc_submit_aux_command(struct ddc_service *ddc, current_payload.i2c_over_aux = payload->i2c_over_aux; current_payload.length = is_end_of_payload ? payload->length - retrieved : DEFAULT_AUX_MAX_DATA_SIZE; - current_payload.mot = !is_end_of_payload; + /* set mot (middle of transaction) to false +* if it is the last payload +*/ + current_payload.mot = is_end_of_payload ? payload->mot:true; current_payload.reply = payload->reply; current_payload.write = payload->write; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 48/64] drm/amd/powerplay: suppress compile error around BUG_ON
From: Evan Quan [ Upstream commit 75bc07e2403caea9ecac69f766dfb7dc33547594 ] To suppress the compile error below for "ARCH=arc". drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c: In function 'arcturus_fill_eeprom_i2c_req': >> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function >> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration] 22 | pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ | ^~~ include/asm-generic/bug.h:62:57: note: in expansion of macro 'BUG' 62 | #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0) | ^~~ drivers/gpu/drm/amd/amdgpu/../powerplay/arcturus_ppt.c:2157:2: note: in expansion of macro 'BUG_ON' 2157 | BUG_ON(numbytes > MAX_SW_I2C_COMMANDS); Signed-off-by: Evan Quan Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c index 27c5fc9572b27..e4630a76d7bf5 100644 --- a/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/arcturus_ppt.c @@ -2042,8 +2042,6 @@ static void arcturus_fill_eeprom_i2c_req(SwI2cRequest_t *req, bool write, { int i; - BUG_ON(numbytes > MAX_SW_I2C_COMMANDS); - req->I2CcontrollerPort = 0; req->I2CSpeed = 2; req->SlaveAddress = address; @@ -2081,6 +2079,12 @@ static int arcturus_i2c_eeprom_read_data(struct i2c_adapter *control, struct smu_table_context *smu_table = >smu.smu_table; struct smu_table *table = _table->driver_table; + if (numbytes > MAX_SW_I2C_COMMANDS) { + dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n", + numbytes, MAX_SW_I2C_COMMANDS); + return -EINVAL; + } + memset(, 0, sizeof(req)); arcturus_fill_eeprom_i2c_req(, false, address, numbytes, data); @@ -2117,6 +2121,12 @@ static int arcturus_i2c_eeprom_write_data(struct i2c_adapter *control, SwI2cRequest_t req; struct amdgpu_device *adev = to_amdgpu_device(control); + if (numbytes > MAX_SW_I2C_COMMANDS) { + dev_err(adev->dev, "numbytes requested %d is over max allowed %d\n", + numbytes, MAX_SW_I2C_COMMANDS); + return -EINVAL; + } + memset(, 0, sizeof(req)); arcturus_fill_eeprom_i2c_req(, true, address, numbytes, data); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 39/64] drm/amd/powerplay: fix compile error with ARCH=arc
From: Evan Quan [ Upstream commit 9822ba2ead1baa3de4860ad9472f652c4cc78c9c ] Fix the compile error below: drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c: In function 'smu_v11_0_init_microcode': >> arch/arc/include/asm/bug.h:22:2: error: implicit declaration of function >> 'pr_warn'; did you mean 'pci_warn'? [-Werror=implicit-function-declaration] 22 | pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ | ^~~ drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:176:3: note: in expansion of macro 'BUG' 176 | BUG(); Reported-by: kernel test robot Signed-off-by: Evan Quan Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index aa76c2cea7471..7897be877b961 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -164,7 +164,8 @@ int smu_v11_0_init_microcode(struct smu_context *smu) chip_name = "navi12"; break; default: - BUG(); + dev_err(adev->dev, "Unsupported ASIC type %d\n", adev->asic_type); + return -EINVAL; } snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 35/64] drm/amdgpu/debugfs: fix ref count leak when pm_runtime_get_sync fails
From: Alex Deucher [ Upstream commit 9eee152aab56d374edb9ad21b3db05f5cdda2fe6 ] The call to pm_runtime_get_sync increments the counter even in case of failure, leading to incorrect ref count. In case of failure, decrement the ref count before returning. Acked-by: Evan Quan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 95 +++-- 1 file changed, 70 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index a414da22a359c..386b979e08522 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -223,12 +223,16 @@ static int amdgpu_debugfs_process_reg_op(bool read, struct file *f, *pos &= (1UL << 22) - 1; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } if (use_bank) { if ((sh_bank != 0x && sh_bank >= adev->gfx.config.max_sh_per_se) || @@ -332,12 +336,16 @@ static ssize_t amdgpu_debugfs_regs_pcie_read(struct file *f, char __user *buf, return -EINVAL; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } while (size) { uint32_t value; @@ -387,12 +395,16 @@ static ssize_t amdgpu_debugfs_regs_pcie_write(struct file *f, const char __user return -EINVAL; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } while (size) { uint32_t value; @@ -443,12 +455,16 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file *f, char __user *buf, return -EINVAL; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } while (size) { uint32_t value; @@ -498,12 +514,16 @@ static ssize_t amdgpu_debugfs_regs_didt_write(struct file *f, const char __user return -EINVAL; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } while (size) { uint32_t value; @@ -554,12 +574,16 @@ static ssize_t amdgpu_debugfs_regs_smc_read(struct file *f, char __user *buf, return -EINVAL; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } while (size) { uint32_t value; @@ -609,12 +633,16 @@ static ssize_t amdgpu_debugfs_regs_smc_write(struct file *f, const char __user * return -EINVAL; r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } r = amdgpu_virt_enable_access_debugfs(adev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(adev->ddev->dev); return r; + } while (size) { uint32_t value; @@ -764,12 +792,16 @@ static ssize_t amdgpu_debugfs_sensor_read(struct file *f, char __user *buf, valuesize = sizeof(values); r = pm_runtime_get_sync(adev->ddev->dev); - if (r < 0) + if (r < 0) { +
[PATCH AUTOSEL 5.8 10/64] drm/radeon: Fix reference count leaks caused by pm_runtime_get_sync
From: Aditya Pakki [ Upstream commit 9fb10671011143d15b6b40d6d5fa9c52c57e9d63 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the reference count before returning the error. Acked-by: Evan Quan Signed-off-by: Aditya Pakki Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_display.c | 4 +++- drivers/gpu/drm/radeon/radeon_drv.c | 4 +++- drivers/gpu/drm/radeon/radeon_kms.c | 4 +++- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 35db79a168bf7..df1a7eb736517 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -635,8 +635,10 @@ radeon_crtc_set_config(struct drm_mode_set *set, dev = set->crtc->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_crtc_helper_set_config(set, ctx); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index bbb0883e8ce6a..62b5069122cc4 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -549,8 +549,10 @@ long radeon_drm_ioctl(struct file *filp, long ret; dev = file_priv->minor->dev; ret = pm_runtime_get_sync(dev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_autosuspend(dev->dev); return ret; + } ret = drm_ioctl(filp, cmd, arg); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index c5d1dc9618a40..99ee60f8b604d 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -638,8 +638,10 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv) file_priv->driver_priv = NULL; r = pm_runtime_get_sync(dev->dev); - if (r < 0) + if (r < 0) { + pm_runtime_put_autosuspend(dev->dev); return r; + } /* new gpu have virtual address space support */ if (rdev->family >= CHIP_CAYMAN) { -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 30/64] drm/radeon: disable AGP by default
From: Christian König [ Upstream commit ba806f98f868ce107aa9c453fef751de9980e4af ] Always use the PCI GART instead. We just have to many cases where AGP still causes problems. This means a performance regression for some GPUs, but also a bug fix for some others. Signed-off-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_drv.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 62b5069122cc4..4cd30613fa1dd 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -171,12 +171,7 @@ int radeon_no_wb; int radeon_modeset = -1; int radeon_dynclks = -1; int radeon_r4xx_atom = 0; -#ifdef __powerpc__ -/* Default to PCI on PowerPC (fdo #95017) */ int radeon_agpmode = -1; -#else -int radeon_agpmode = 0; -#endif int radeon_vram_limit = 0; int radeon_gart_size = -1; /* auto */ int radeon_benchmarking = 0; -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH AUTOSEL 5.8 09/64] drm/amdgpu: avoid dereferencing a NULL pointer
From: Jack Xiao [ Upstream commit 55611b507fd6453d26030c0c0619fdf0c262766d ] Check if irq_src is NULL to avoid dereferencing a NULL pointer, for MES ring is uneccessary to recieve an interrupt notification. Signed-off-by: Jack Xiao Acked-by: Alex Deucher Reviewed-by: Hawking Zhang Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index d878fe7fee51c..3414e119f0cbf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -416,7 +416,9 @@ int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring, ring->fence_drv.gpu_addr = adev->uvd.inst[ring->me].gpu_addr + index; } amdgpu_fence_write(ring, atomic_read(>fence_drv.last_seq)); - amdgpu_irq_get(adev, irq_src, irq_type); + + if (irq_src) + amdgpu_irq_get(adev, irq_src, irq_type); ring->fence_drv.irq_src = irq_src; ring->fence_drv.irq_type = irq_type; @@ -537,8 +539,9 @@ void amdgpu_fence_driver_fini(struct amdgpu_device *adev) /* no need to trigger GPU reset as we are unloading */ amdgpu_fence_driver_force_completion(ring); } - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); drm_sched_fini(>sched); del_timer_sync(>fence_drv.fallback_timer); for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j) @@ -574,8 +577,9 @@ void amdgpu_fence_driver_suspend(struct amdgpu_device *adev) } /* disable the interrupt */ - amdgpu_irq_put(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_put(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } @@ -601,8 +605,9 @@ void amdgpu_fence_driver_resume(struct amdgpu_device *adev) continue; /* enable the interrupt */ - amdgpu_irq_get(adev, ring->fence_drv.irq_src, - ring->fence_drv.irq_type); + if (ring->fence_drv.irq_src) + amdgpu_irq_get(adev, ring->fence_drv.irq_src, + ring->fence_drv.irq_type); } } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory
Le 10/08/2020 à 17:42, Dan Carpenter a écrit : On Sun, Aug 09, 2020 at 10:34:06PM +0200, Christophe JAILLET wrote: When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than 'sgt' (i.e struct sg_table), so this could lead to memory corruption. The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but it won't lead to corruption. 11 struct scatterlist { 12 unsigned long page_link; 13 unsigned intoffset; 14 unsigned intlength; 15 dma_addr_t dma_address; 16 #ifdef CONFIG_NEED_SG_DMA_LENGTH 17 unsigned intdma_length; 18 #endif 19 }; 42 struct sg_table { 43 struct scatterlist *sgl;/* the list */ 44 unsigned int nents; /* number of mapped entries */ 45 unsigned int orig_nents;/* original size of list */ 46 }; regards, dan carpenter My bad. I read 'struct scatterlist sgl' (without the *) Thanks for the follow-up, Dan. Doesn't smatch catch such mismatch? (I've not run smatch for a while, so it is maybe reported) Well, the proposal is still valid, even if it has less impact as initially thought. Thx for the review. CJ ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu/display: drop unused function
Acked-by: Nirmoy Das On 8/10/20 5:56 PM, Alex Deucher wrote: This is not longer used as of the latest rework of this code so drop it to avoid a unused function warning. Signed-off-by: Alex Deucher --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 -- 1 file changed, 18 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 0f9d87773fa3..ff5f7f7ceec6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2827,24 +2827,6 @@ dm_atomic_get_new_state(struct drm_atomic_state *state) return NULL; } -static struct dm_atomic_state * -dm_atomic_get_old_state(struct drm_atomic_state *state) -{ - struct drm_device *dev = state->dev; - struct amdgpu_device *adev = dev->dev_private; - struct amdgpu_display_manager *dm = >dm; - struct drm_private_obj *obj; - struct drm_private_state *old_obj_state; - int i; - - for_each_old_private_obj_in_state(state, obj, old_obj_state, i) { - if (obj->funcs == dm->atomic_obj.funcs) - return to_dm_atomic_state(old_obj_state); - } - - return NULL; -} - static struct drm_private_state * dm_atomic_duplicate_state(struct drm_private_obj *obj) { ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu/display: drop unused function
This is not longer used as of the latest rework of this code so drop it to avoid a unused function warning. Signed-off-by: Alex Deucher --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 18 -- 1 file changed, 18 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 0f9d87773fa3..ff5f7f7ceec6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2827,24 +2827,6 @@ dm_atomic_get_new_state(struct drm_atomic_state *state) return NULL; } -static struct dm_atomic_state * -dm_atomic_get_old_state(struct drm_atomic_state *state) -{ - struct drm_device *dev = state->dev; - struct amdgpu_device *adev = dev->dev_private; - struct amdgpu_display_manager *dm = >dm; - struct drm_private_obj *obj; - struct drm_private_state *old_obj_state; - int i; - - for_each_old_private_obj_in_state(state, obj, old_obj_state, i) { - if (obj->funcs == dm->atomic_obj.funcs) - return to_dm_atomic_state(old_obj_state); - } - - return NULL; -} - static struct drm_private_state * dm_atomic_duplicate_state(struct drm_private_obj *obj) { -- 2.25.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/1] drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal
On Mon, Aug 10, 2020, at 17:38, Alex Deucher wrote: > On Sat, Aug 8, 2020 at 4:51 PM Daniel Kolesa wrote: > > > > GFP_KERNEL may and will sleep, and this is being executed in > > a non-preemptible context; this will mess things up since it's > > called inbetween DC_FP_START/END, and rescheduling will result > > in the DC_FP_END later being called in a different context (or > > just crashing if any floating point/vector registers/instructions > > are used after the call is resumed in a different context). > > > > Signed-off-by: Daniel Kolesa > > We should probably find a way to pre-allocate this, but in the > meantime, I'll apply the patch. Indeed. But this should work as an immediate solution for people experiencing issues. For me it completely stopped dmesg noise on ppc64le and aarch64, and seemingly even fixed the screen flicker problem on POWER that I had to formerly work around by forcing the clocks to high (been running it for days and it's worked perfectly fine with adaptive during the time). Daniel > > Thanks! > > Alex > > > > --- > > drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > > b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > > index 991eddd10952..c31d1f30e505 100644 > > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > > @@ -3141,7 +3141,7 @@ static bool dcn20_validate_bandwidth_internal(struct > > dc *dc, struct dc_state *co > > int vlevel = 0; > > int pipe_split_from[MAX_PIPES]; > > int pipe_cnt = 0; > > - display_e2e_pipe_params_st *pipes = > > kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), > > GFP_KERNEL); > > + display_e2e_pipe_params_st *pipes = > > kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), > > GFP_ATOMIC); > > DC_LOGGER_INIT(dc->ctx->logger); > > > > BW_VAL_TRACE_COUNT(); > > -- > > 2.28.0 > > > > ___ > > amd-gfx mailing list > > amd-gfx@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory
On Sun, Aug 09, 2020 at 10:34:06PM +0200, Christophe JAILLET wrote: > When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead > of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than > 'sgt' (i.e struct sg_table), so this could lead to memory corruption. The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but it won't lead to corruption. 11 struct scatterlist { 12 unsigned long page_link; 13 unsigned intoffset; 14 unsigned intlength; 15 dma_addr_t dma_address; 16 #ifdef CONFIG_NEED_SG_DMA_LENGTH 17 unsigned intdma_length; 18 #endif 19 }; 42 struct sg_table { 43 struct scatterlist *sgl;/* the list */ 44 unsigned int nents; /* number of mapped entries */ 45 unsigned int orig_nents;/* original size of list */ 46 }; regards, dan carpenter ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: add DCN support for aarch64
Applied. Thanks! Alex On Sat, Aug 8, 2020 at 4:51 PM Daniel Kolesa wrote: > > This adds ARM64 support into the DCN. This mainly enables support > for Navi graphics cards. The dcn10 changes haven't been tested, > since I don't have the relevant hardware available, but there > is no way to conditionally disable them, so I've done them anyway. > > Signed-off-by: Daniel Kolesa > --- > drivers/gpu/drm/amd/display/Kconfig | 2 +- > drivers/gpu/drm/amd/display/dc/calcs/Makefile | 7 ++ > .../gpu/drm/amd/display/dc/clk_mgr/Makefile | 7 ++ > drivers/gpu/drm/amd/display/dc/dcn10/Makefile | 7 ++ > .../drm/amd/display/dc/dcn10/dcn10_resource.c | 81 --- > drivers/gpu/drm/amd/display/dc/dcn20/Makefile | 4 + > drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 4 + > drivers/gpu/drm/amd/display/dc/dml/Makefile | 13 +++ > drivers/gpu/drm/amd/display/dc/dsc/Makefile | 5 ++ > drivers/gpu/drm/amd/display/dc/os_types.h | 4 + > 10 files changed, 102 insertions(+), 32 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/Kconfig > b/drivers/gpu/drm/amd/display/Kconfig > index 77569097a480..f24abf428534 100644 > --- a/drivers/gpu/drm/amd/display/Kconfig > +++ b/drivers/gpu/drm/amd/display/Kconfig > @@ -6,7 +6,7 @@ config DRM_AMD_DC > bool "AMD DC - Enable new display engine" > default y > select SND_HDA_COMPONENT if SND_HDA_CORE > - select DRM_AMD_DC_DCN if (X86 || PPC64) && !(KCOV_INSTRUMENT_ALL && > KCOV_ENABLE_COMPARISONS) > + select DRM_AMD_DC_DCN if (X86 || PPC64 || (ARM64 && > KERNEL_MODE_NEON)) && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS) > help > Choose this option if you want to use the new display engine > support for AMDGPU. This adds required support for Vega and > diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile > b/drivers/gpu/drm/amd/display/dc/calcs/Makefile > index 4674aca8f206..64f515d74410 100644 > --- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile > +++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile > @@ -33,6 +33,10 @@ ifdef CONFIG_PPC64 > calcs_ccflags := -mhard-float -maltivec > endif > > +ifdef CONFIG_ARM64 > +calcs_rcflags := -mgeneral-regs-only > +endif > + > ifdef CONFIG_CC_IS_GCC > ifeq ($(call cc-ifversion, -lt, 0701, y), y) > IS_OLD_GCC = 1 > @@ -53,6 +57,9 @@ endif > CFLAGS_$(AMDDALPATH)/dc/calcs/dcn_calcs.o := $(calcs_ccflags) > CFLAGS_$(AMDDALPATH)/dc/calcs/dcn_calc_auto.o := $(calcs_ccflags) > CFLAGS_$(AMDDALPATH)/dc/calcs/dcn_calc_math.o := $(calcs_ccflags) > -Wno-tautological-compare > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/calcs/dcn_calcs.o := $(calcs_rcflags) > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/calcs/dcn_calc_auto.o := $(calcs_rcflags) > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/calcs/dcn_calc_math.o := $(calcs_rcflags) > > BW_CALCS = dce_calcs.o bw_fixed.o custom_float.o > > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > index 52b1ce775a1e..1a495759a034 100644 > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile > @@ -104,6 +104,13 @@ ifdef CONFIG_PPC64 > CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call > cc-option,-mno-gnu-attribute) > endif > > +# prevent build errors: > +# ...: '-mgeneral-regs-only' is incompatible with the use of floating-point > types > +# this file is unused on arm64, just like on ppc64 > +ifdef CONFIG_ARM64 > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := > -mgeneral-regs-only > +endif > + > AMD_DAL_CLK_MGR_DCN21 = $(addprefix > $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21)) > > AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21) > diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/Makefile > b/drivers/gpu/drm/amd/display/dc/dcn10/Makefile > index 62ad1a11bff9..733e6e6e43bd 100644 > --- a/drivers/gpu/drm/amd/display/dc/dcn10/Makefile > +++ b/drivers/gpu/drm/amd/display/dc/dcn10/Makefile > @@ -31,4 +31,11 @@ DCN10 = dcn10_init.o dcn10_resource.o dcn10_ipp.o > dcn10_hw_sequencer.o \ > > AMD_DAL_DCN10 = $(addprefix $(AMDDALPATH)/dc/dcn10/,$(DCN10)) > > +# fix: > +# ...: '-mgeneral-regs-only' is incompatible with the use of floating-point > types > +# aarch64 does not support soft-float, so use hard-float and handle this in > code > +ifdef CONFIG_ARM64 > +CFLAGS_REMOVE_$(AMDDALPATH)/dc/dcn10/dcn10_resource.o := -mgeneral-regs-only > +endif > + > AMD_DISPLAY_FILES += $(AMD_DAL_DCN10) > diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c > b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c > index 17d5cb422025..07571f84e0f8 100644 > --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c > +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c > @@ -1331,6 +1331,47 @@ static uint32_t read_pipe_fuses(struct dc_context *ctx) > return value; > } > > +/* > + * Some architectures don't support soft-float (e.g. aarch64), on those > + * this
Re: [PATCH 1/1] drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal
On Sat, Aug 8, 2020 at 4:51 PM Daniel Kolesa wrote: > > GFP_KERNEL may and will sleep, and this is being executed in > a non-preemptible context; this will mess things up since it's > called inbetween DC_FP_START/END, and rescheduling will result > in the DC_FP_END later being called in a different context (or > just crashing if any floating point/vector registers/instructions > are used after the call is resumed in a different context). > > Signed-off-by: Daniel Kolesa We should probably find a way to pre-allocate this, but in the meantime, I'll apply the patch. Thanks! Alex > --- > drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > index 991eddd10952..c31d1f30e505 100644 > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > @@ -3141,7 +3141,7 @@ static bool dcn20_validate_bandwidth_internal(struct dc > *dc, struct dc_state *co > int vlevel = 0; > int pipe_split_from[MAX_PIPES]; > int pipe_cnt = 0; > - display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count > * sizeof(display_e2e_pipe_params_st), GFP_KERNEL); > + display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count > * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC); > DC_LOGGER_INIT(dc->ctx->logger); > > BW_VAL_TRACE_COUNT(); > -- > 2.28.0 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.
On Mon, Aug 10, 2020 at 4:13 PM Bas Nieuwenhuizen wrote: > > On Mon, Aug 10, 2020 at 3:09 PM Daniel Vetter wrote: > > > > On Mon, Aug 10, 2020 at 02:49:00PM +0200, Michel Dänzer wrote: > > > On 2020-08-10 2:28 p.m., Daniel Vetter wrote: > > > > > > > > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea > > > > here to a very strong recommendation, i.e. please do this except if > > > > there's and amd ddx which somehow wants to change tiling mode while a fb > > > > exists, and expects this to propagate. > > > > > > > > In i915 we even disallow the set_tiling ioctl with an error if an fb > > > > exists, just to make sure userspace behaves. Even if userspace uses > > > > set_tiling, this way we can at least enforce the same semantics of > > > > "client > > > > can't pull compositor over the table with a set_tiling at the wrong > > > > time" > > > > of modifiers. > > > > > > FWIW, xf86-video-amdgpu doesn't have any code to set the tiling > > > metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do. > > > > Ah right you do everything with glamour, so this should never show up as a > > problem. > > I think it is a good idea to do so, but cannot do it completely in > this series as we don't define modifiers for GFX6-GFX8 GPU generations > yet. (wanted to leave these out for a bit to reduce the scope for the > initial version) Hm right, that makes it a bit awkward. > That said, there is a series that captures the tiling flags on FB > creation: https://patchwork.freedesktop.org/series/80109/ Yeah, but it only pushes it down into the state objects. Good first step, what I'm proposing is to push it all the way into addfb/struct drm_framebuffer. Since drm_framebuffer is also an invariant thing it makes the most sense to keep that there. I'm also discussing with Nicholas about what would be the ideal end state. But yeah maybe one thing at time. -Daniel > > > -Daniel > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch > > ___ > > amd-gfx mailing list > > amd-gfx@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.
On Mon, Aug 10, 2020 at 3:09 PM Daniel Vetter wrote: > > On Mon, Aug 10, 2020 at 02:49:00PM +0200, Michel Dänzer wrote: > > On 2020-08-10 2:28 p.m., Daniel Vetter wrote: > > > > > > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea > > > here to a very strong recommendation, i.e. please do this except if > > > there's and amd ddx which somehow wants to change tiling mode while a fb > > > exists, and expects this to propagate. > > > > > > In i915 we even disallow the set_tiling ioctl with an error if an fb > > > exists, just to make sure userspace behaves. Even if userspace uses > > > set_tiling, this way we can at least enforce the same semantics of "client > > > can't pull compositor over the table with a set_tiling at the wrong time" > > > of modifiers. > > > > FWIW, xf86-video-amdgpu doesn't have any code to set the tiling > > metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do. > > Ah right you do everything with glamour, so this should never show up as a > problem. I think it is a good idea to do so, but cannot do it completely in this series as we don't define modifiers for GFX6-GFX8 GPU generations yet. (wanted to leave these out for a bit to reduce the scope for the initial version) That said, there is a series that captures the tiling flags on FB creation: https://patchwork.freedesktop.org/series/80109/ > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.
On Mon, Aug 10, 2020 at 02:49:00PM +0200, Michel Dänzer wrote: > On 2020-08-10 2:28 p.m., Daniel Vetter wrote: > > > > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea > > here to a very strong recommendation, i.e. please do this except if > > there's and amd ddx which somehow wants to change tiling mode while a fb > > exists, and expects this to propagate. > > > > In i915 we even disallow the set_tiling ioctl with an error if an fb > > exists, just to make sure userspace behaves. Even if userspace uses > > set_tiling, this way we can at least enforce the same semantics of "client > > can't pull compositor over the table with a set_tiling at the wrong time" > > of modifiers. > > FWIW, xf86-video-amdgpu doesn't have any code to set the tiling > metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do. Ah right you do everything with glamour, so this should never show up as a problem. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] dma-buf.rst: repair length of title underline
On Mon, Aug 10, 2020 at 01:25:40PM +0200, Christian König wrote: > Am 09.08.20 um 08:17 schrieb Lukas Bulwahn: > > With commit 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are > > a bad idea"), document generation warns: > > > >Documentation/driver-api/dma-buf.rst:182: \ > >WARNING: Title underline too short. > > > > Repair length of title underline to remove warning. > > > > Fixes: 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are a bad > > idea") > > Signed-off-by: Lukas Bulwahn > > Acked-by: Christian König > > Should I pick it up into drm-misc-next? Yes please. For the future if you need to check if someone has commit rights and can push themselves: https://people.freedesktop.org/~seanpaul/whomisc.html Yeah with gitlab this would all be a bit more reasonable, but we get by meanwhile :-) Cheers, Daniel > > > --- > > Daniel, please pick this minor non-urgent fix to your new documentation. > > > > Documentation/driver-api/dma-buf.rst | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/Documentation/driver-api/dma-buf.rst > > b/Documentation/driver-api/dma-buf.rst > > index 100bfd227265..13ea0cc0a3fa 100644 > > --- a/Documentation/driver-api/dma-buf.rst > > +++ b/Documentation/driver-api/dma-buf.rst > > @@ -179,7 +179,7 @@ DMA Fence uABI/Sync File > > :internal: > > Indefinite DMA Fences > > - > > +~ > > At various times _fence with an indefinite time until dma_fence_wait() > > finishes have been proposed. Examples include: > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/display: convert to use le16_add_cpu()
Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu(). Signed-off-by: Qinglang Miao --- drivers/gpu/drm/amd/display/dc/bios/command_table.c | 4 +--- drivers/gpu/drm/amd/display/dc/bios/command_table2.c | 5 + 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c b/drivers/gpu/drm/amd/display/dc/bios/command_table.c index 5815983ca..070459e3e 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c @@ -1877,9 +1877,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3( * but it is 4 either from Edid data (spec CEA 861) * or CEA timing table. */ - params.usV_SyncOffset = - cpu_to_le16(le16_to_cpu(params.usV_SyncOffset) + 1); - + le16_add_cpu(_SyncOffset, 1); } } diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c index bed91572f..e8f52eb8e 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c +++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c @@ -569,10 +569,7 @@ static enum bp_result set_crtc_using_dtd_timing_v3( * but it is 4 either from Edid data (spec CEA 861) * or CEA timing table. */ - params.v_syncoffset = - cpu_to_le16(le16_to_cpu(params.v_syncoffset) + - 1); - + le16_add_cpu(_syncoffset, 1); } } -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.
On 2020-08-10 2:28 p.m., Daniel Vetter wrote: > > Ok just learned that amdgpu hat set/get_tiling, so I'm upgrading my idea > here to a very strong recommendation, i.e. please do this except if > there's and amd ddx which somehow wants to change tiling mode while a fb > exists, and expects this to propagate. > > In i915 we even disallow the set_tiling ioctl with an error if an fb > exists, just to make sure userspace behaves. Even if userspace uses > set_tiling, this way we can at least enforce the same semantics of "client > can't pull compositor over the table with a set_tiling at the wrong time" > of modifiers. FWIW, xf86-video-amdgpu doesn't have any code to set the tiling metadata, only Mesa and presumably AMD's Vulkan/OpenGL UMDs do. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 7/7] drm/amd/display: Replace DRM private objects with subclassed DRM atomic state
On Fri, Aug 07, 2020 at 10:32:11AM -0400, Kazlauskas, Nicholas wrote: > On 2020-08-07 4:52 a.m., dan...@ffwll.ch wrote: > > On Thu, Jul 30, 2020 at 04:36:42PM -0400, Nicholas Kazlauskas wrote: > > > @@ -440,7 +431,7 @@ struct dm_crtc_state { > > > #define to_dm_crtc_state(x) container_of(x, struct dm_crtc_state, base) > > > struct dm_atomic_state { > > > - struct drm_private_state base; > > > + struct drm_atomic_state base; > > > struct dc_state *context; > > > > Also curiosity: Can't we just embed dc_state here, instead of a pointer? > > Then it would become a lot more obvious that mostly this is a state object > > container like drm_atomic_state, but for the DC specific state structures. > > And we could look into moving the actual DC states into drm private states > > perhaps (if that helps with the code structure and overall flow). > > > > Maybe as next steps. > > -Daniel > > > > It's the refcounting that's the problem with this stuff. I'd like to move DC > to a model where we have no memory allocation/ownership but that might be a > bit of a more long term plan at this point. > > Same with dc_plane_state and dc_stream_state as well - these could exist on > the DRM objects as long as they're not refcounted. Hm what's the refcounting problem you're having? Or is it the lack of refcounting, and dc having different ideas about lifetimes than atomic? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 3/7] drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes
On Mon, Aug 10, 2020 at 02:30:05PM +0200, Christian König wrote: > Am 10.08.20 um 14:25 schrieb Daniel Vetter: > > On Fri, Aug 07, 2020 at 10:29:09AM -0400, Kazlauskas, Nicholas wrote: > > > On 2020-08-07 4:30 a.m., dan...@ffwll.ch wrote: > > > > On Thu, Jul 30, 2020 at 04:36:38PM -0400, Nicholas Kazlauskas wrote: > > > > > [Why] > > > > > We're racing with userspace as the flags could potentially change > > > > > from when we acquired and validated them in commit_check. > > > > Uh ... I didn't know these could change. I think my comments on Bas' > > > > series are even more relevant now. I think long term would be best to > > > > bake > > > > these flags in at addfb time when modifiers aren't set. And otherwise > > > > always use the modifiers flag, and completely ignore the legacy flags > > > > here. > > > > -Daniel > > > > > > > There's a set tiling/mod flags IOCTL that can be called after addfb > > > happens, > > > so unless there's some sort of driver magic preventing this from working > > > when it's already been pinned for scanout then I don't see anything > > > stopping > > > this from happening. > > > > > > I still need to review the modifiers series in a little more detail but > > > that > > > looks like a good approach to fixing these kind of issues. > > Yeah we had the same model for i915, but it's awkward and can surprise > > compositors (since the client could change the tiling mode from underneath > > the compositor). So freezing the tiling mode at addfb time is the right > > thing to do, and anyway how things work with modifiers. > > > > Ofc maybe good to audit the -amd driver, but hopefully it doesn't do > > anything silly with changed tiling. If it does, it's kinda sad day. > > Marek should know this right away, but I think we only set the tilling flags > once while exporting the BO and then never change them. Yeah it's the only reasonable model really, since set/get_tiling is not synchronized with any winsys protocol. So full of races by design. The only thing I'd worry about is if you do set_tiling after addfb in your ddx. That one is save, but would badly break if we sample modifiers from set_tiling flags only once at addfb time. -Daniel > > Regards, > Christian. > > > > > Btw for i915 we even went a step further, and made the set_tiling ioctl > > return an error if a framebuffer for that gem_bo existed. Just to make > > sure we don't ever accidentally break this. > > > > Cheers, Daniel > > > > > Regards, > > > Nicholas Kazlauskas > > > > > > > > [How] > > > > > We unfortunately can't drop this function in its entirety from > > > > > prepare_planes since we don't know the afb->address at commit_check > > > > > time yet. > > > > > > > > > > So instead of querying new tiling_flags and tmz_surface use the ones > > > > > from the plane_state directly. > > > > > > > > > > While we're at it, also update the force_disable_dcc option based > > > > > on the state from atomic check. > > > > > > > > > > Cc: Bhawanpreet Lakha > > > > > Cc: Rodrigo Siqueira > > > > > Signed-off-by: Nicholas Kazlauskas > > > > > --- > > > > >.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 > > > > > ++- > > > > >1 file changed, 19 insertions(+), 17 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > > > index bf1881bd492c..f78c09c9585e 100644 > > > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > > > @@ -5794,14 +5794,8 @@ static int dm_plane_helper_prepare_fb(struct > > > > > drm_plane *plane, > > > > > struct list_head list; > > > > > struct ttm_validate_buffer tv; > > > > > struct ww_acquire_ctx ticket; > > > > > - uint64_t tiling_flags; > > > > > uint32_t domain; > > > > > int r; > > > > > - bool tmz_surface = false; > > > > > - bool force_disable_dcc = false; > > > > > - > > > > > - dm_plane_state_old = to_dm_plane_state(plane->state); > > > > > - dm_plane_state_new = to_dm_plane_state(new_state); > > > > > if (!new_state->fb) { > > > > > DRM_DEBUG_DRIVER("No FB bound\n"); > > > > > @@ -5845,27 +5839,35 @@ static int dm_plane_helper_prepare_fb(struct > > > > > drm_plane *plane, > > > > > return r; > > > > > } > > > > > - amdgpu_bo_get_tiling_flags(rbo, _flags); > > > > > - > > > > > - tmz_surface = amdgpu_bo_encrypted(rbo); > > > > > - > > > > > ttm_eu_backoff_reservation(, ); > > > > > afb->address = amdgpu_bo_gpu_offset(rbo); > > > > > amdgpu_bo_ref(rbo); > > > > > + /** > > > > > + * We don't do surface updates on planes that have been newly > > > > > created, > > > > > + * but we also don't have the afb->address during atomic check. > > > > > + * > > > > > + * Fill in buffer attributes depending on the address here, but >
Re: [PATCH 3/7] drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes
Am 10.08.20 um 14:25 schrieb Daniel Vetter: On Fri, Aug 07, 2020 at 10:29:09AM -0400, Kazlauskas, Nicholas wrote: On 2020-08-07 4:30 a.m., dan...@ffwll.ch wrote: On Thu, Jul 30, 2020 at 04:36:38PM -0400, Nicholas Kazlauskas wrote: [Why] We're racing with userspace as the flags could potentially change from when we acquired and validated them in commit_check. Uh ... I didn't know these could change. I think my comments on Bas' series are even more relevant now. I think long term would be best to bake these flags in at addfb time when modifiers aren't set. And otherwise always use the modifiers flag, and completely ignore the legacy flags here. -Daniel There's a set tiling/mod flags IOCTL that can be called after addfb happens, so unless there's some sort of driver magic preventing this from working when it's already been pinned for scanout then I don't see anything stopping this from happening. I still need to review the modifiers series in a little more detail but that looks like a good approach to fixing these kind of issues. Yeah we had the same model for i915, but it's awkward and can surprise compositors (since the client could change the tiling mode from underneath the compositor). So freezing the tiling mode at addfb time is the right thing to do, and anyway how things work with modifiers. Ofc maybe good to audit the -amd driver, but hopefully it doesn't do anything silly with changed tiling. If it does, it's kinda sad day. Marek should know this right away, but I think we only set the tilling flags once while exporting the BO and then never change them. Regards, Christian. Btw for i915 we even went a step further, and made the set_tiling ioctl return an error if a framebuffer for that gem_bo existed. Just to make sure we don't ever accidentally break this. Cheers, Daniel Regards, Nicholas Kazlauskas [How] We unfortunately can't drop this function in its entirety from prepare_planes since we don't know the afb->address at commit_check time yet. So instead of querying new tiling_flags and tmz_surface use the ones from the plane_state directly. While we're at it, also update the force_disable_dcc option based on the state from atomic check. Cc: Bhawanpreet Lakha Cc: Rodrigo Siqueira Signed-off-by: Nicholas Kazlauskas --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index bf1881bd492c..f78c09c9585e 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -5794,14 +5794,8 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane, struct list_head list; struct ttm_validate_buffer tv; struct ww_acquire_ctx ticket; - uint64_t tiling_flags; uint32_t domain; int r; - bool tmz_surface = false; - bool force_disable_dcc = false; - - dm_plane_state_old = to_dm_plane_state(plane->state); - dm_plane_state_new = to_dm_plane_state(new_state); if (!new_state->fb) { DRM_DEBUG_DRIVER("No FB bound\n"); @@ -5845,27 +5839,35 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane, return r; } - amdgpu_bo_get_tiling_flags(rbo, _flags); - - tmz_surface = amdgpu_bo_encrypted(rbo); - ttm_eu_backoff_reservation(, ); afb->address = amdgpu_bo_gpu_offset(rbo); amdgpu_bo_ref(rbo); + /** +* We don't do surface updates on planes that have been newly created, +* but we also don't have the afb->address during atomic check. +* +* Fill in buffer attributes depending on the address here, but only on +* newly created planes since they're not being used by DC yet and this +* won't modify global state. +*/ + dm_plane_state_old = to_dm_plane_state(plane->state); + dm_plane_state_new = to_dm_plane_state(new_state); + if (dm_plane_state_new->dc_state && - dm_plane_state_old->dc_state != dm_plane_state_new->dc_state) { - struct dc_plane_state *plane_state = dm_plane_state_new->dc_state; + dm_plane_state_old->dc_state != dm_plane_state_new->dc_state) { + struct dc_plane_state *plane_state = + dm_plane_state_new->dc_state; + bool force_disable_dcc = !plane_state->dcc.enable; - force_disable_dcc = adev->asic_type == CHIP_RAVEN && adev->in_suspend; fill_plane_buffer_attributes( adev, afb, plane_state->format, plane_state->rotation, - tiling_flags, _state->tiling_info, - _state->plane_size, _state->dcc, - _state->address, tmz_surface, - force_disable_dcc); +
Re: [PATCH 5/7] drm/amd/display: Reset plane for anything that's not a FAST update
On Fri, Aug 07, 2020 at 10:26:51AM -0400, Kazlauskas, Nicholas wrote: > On 2020-08-07 4:34 a.m., dan...@ffwll.ch wrote: > > On Thu, Jul 30, 2020 at 04:36:40PM -0400, Nicholas Kazlauskas wrote: > > > [Why] > > > MEDIUM or FULL updates can require global validation or affect > > > bandwidth. By treating these all simply as surface updates we aren't > > > actually passing this through DC global validation. > > > > > > [How] > > > There's currently no way to pass surface updates through DC global > > > validation, nor do I think it's a good idea to change the interface > > > to accept these. > > > > > > DC global validation itself is currently stateless, and we can move > > > our update type checking to be stateless as well by duplicating DC > > > surface checks in DM based on DRM properties. > > > > > > We wanted to rely on DC automatically determining this since DC knows > > > best, but DM is ultimately what fills in everything into DC plane > > > state so it does need to know as well. > > > > > > There are basically only three paths that we exercise in DM today: > > > > > > 1) Cursor (async update) > > > 2) Pageflip (fast update) > > > 3) Full pipe programming (medium/full updates) > > > > > > Which means that anything that's more than a pageflip really needs to > > > go down path #3. > > > > > > So this change duplicates all the surface update checks based on DRM > > > state instead inside of should_reset_plane(). > > > > > > Next step is dropping dm_determine_update_type_for_commit and we no > > > longer require the old DC state at all for global validation. > > > > I think we do something similar in i915, where we have a "nothing except > > base address changed" fast path, but for anything else we fully compute a > > new state. Obviously you should try to keep global state synchronization > > to a minimum for this step, so it's not entirely only 2 options. > > > > Once we have the states, we compare them and figure out whether we can get > > away with a fast modeset operation (maybe what you guys call medium > > update). Anyway I think being slightly more aggressive with computing full > > state, and then falling back to more optimized update again is a good > > approach. Only risk is if we you have too much synchronization in your > > locking (e.g. modern compositors do like to change tiling and stuff, > > especially once you have modifiers enabled, so this shouldn't cause a sync > > across crtc except when absolutely needed). > > -Daniel > > Sounds like the right approach then. > > We can support tiling changes in the fast path, but the more optimized > version of that last check is really linear <-> tiled. That requires global > validation with DC to revalidate bandwidth and calculate requestor > parameters for HW. So we'll have to stall for some of these changes > unfortunately since we need the full HW state for validation. Yeah I think that's perfectly ok, and probably worth it to still optimize for tiled->tiled changes. If you can. tiled<->untiled only happens for boot-splash, so no one cares, but with modifiers the idea is that tiling changes (especially once we get into the different compression modes amdgpu support, and which Bas' series tries to enable) are fairly common and should be doable without any stalls anywhere. Cheers, Daniel > > Regards, > Nicholas Kazlauskas > > > > > > > > > Optimization can come later so we don't reset DC planes at all for > > > MEDIUM udpates and avoid validation, but we might require some extra > > > checks in DM to achieve this. > > > > > > Cc: Bhawanpreet Lakha > > > Cc: Hersen Wu > > > Signed-off-by: Nicholas Kazlauskas > > > --- > > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 25 +++ > > > 1 file changed, 25 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > index 0d5f45742bb5..2cbb29199e61 100644 > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > @@ -8336,6 +8336,31 @@ static bool should_reset_plane(struct > > > drm_atomic_state *state, > > > if (old_other_state->crtc != new_other_state->crtc) > > > return true; > > > + /* Src/dst size and scaling updates. */ > > > + if (old_other_state->src_w != new_other_state->src_w || > > > + old_other_state->src_h != new_other_state->src_h || > > > + old_other_state->crtc_w != new_other_state->crtc_w || > > > + old_other_state->crtc_h != new_other_state->crtc_h) > > > + return true; > > > + > > > + /* Rotation / mirroring updates. */ > > > + if (old_other_state->rotation != new_other_state->rotation) > > > + return true; > > > + > > > + /* Blending updates. */ > > > + if (old_other_state->pixel_blend_mode != > > > +
Re: [PATCH 6/8] drm/amd/display: Set DC options from modifiers.
On Wed, Aug 05, 2020 at 09:32:10AM +0200, dan...@ffwll.ch wrote: > On Tue, Aug 04, 2020 at 11:31:17PM +0200, Bas Nieuwenhuizen wrote: > > This sets the DC tiling options from the modifier, if modifiers > > are used for the FB. This patch by itself does not expose the > > support yet though. > > > > There is not much validation yet to limit the scope of this > > patch, but the current validation is at the same level as > > the BO metadata path. > > > > Signed-off-by: Bas Nieuwenhuizen > > --- > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 109 +- > > 1 file changed, 103 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > index 6ef7f2f8acab..ac913b8f10ef 100644 > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > @@ -3754,6 +3754,93 @@ fill_gfx9_plane_attributes_from_flags(struct > > amdgpu_device *adev, > > return 0; > > } > > > > +static bool > > +modifier_has_dcc(uint64_t modifier) > > +{ > > + return IS_AMD_FMT_MOD(modifier) && AMD_FMT_MOD_GET(DCC, modifier); > > +} > > + > > +static unsigned > > +modifier_gfx9_swizzle_mode(uint64_t modifier) > > +{ > > + if (modifier == DRM_FORMAT_MOD_LINEAR) > > + return 0; > > + > > + return AMD_FMT_MOD_GET(TILE, modifier); > > +} > > + > > +static void > > +fill_gfx9_tiling_info_from_modifier(const struct amdgpu_device *adev, > > + union dc_tiling_info *tiling_info, > > + uint64_t modifier) > > +{ > > + unsigned int mod_bank_xor_bits = AMD_FMT_MOD_GET(BANK_XOR_BITS, > > modifier); > > + unsigned int mod_pipe_xor_bits = AMD_FMT_MOD_GET(PIPE_XOR_BITS, > > modifier); > > + unsigned int pkrs_log2 = AMD_FMT_MOD_GET(PACKERS, modifier); > > + unsigned int pipes_log2 = min(4u, mod_pipe_xor_bits); > > + > > + fill_gfx9_tiling_info_from_device(adev, tiling_info); > > + > > + if (!IS_AMD_FMT_MOD(modifier)) > > + return; > > + > > + tiling_info->gfx9.num_pipes = 1u << pipes_log2; > > + tiling_info->gfx9.num_shader_engines = 1u << (mod_pipe_xor_bits - > > pipes_log2); > > + > > + if (adev->family >= AMDGPU_FAMILY_NV) { > > + tiling_info->gfx9.num_pkrs = 1u << pkrs_log2; > > + } else { > > + tiling_info->gfx9.num_banks = 1u << mod_bank_xor_bits; > > + > > + /* for DCC we know it isn't rb aligned, so rb_per_se doesn't > > matter. */ > > + } > > +} > > + > > +static void > > +block_alignment(unsigned int blocksize_log2, unsigned int *width, unsigned > > int *height) > > +{ > > + unsigned int height_log2 = blocksize_log2 / 2; > > + unsigned int width_log2 = blocksize_log2 - height_log2; > > + > > + *width = 1u << width_log2; > > + *height = 1u << height_log2; > > +} > > + > > +static int > > +fill_gfx9_plane_attributes_from_modifiers(struct amdgpu_device *adev, > > + const struct amdgpu_framebuffer *afb, > > + const enum surface_pixel_format format, > > + const enum dc_rotation_angle rotation, > > + const struct plane_size *plane_size, > > + union dc_tiling_info *tiling_info, > > + struct dc_plane_dcc_param *dcc, > > + struct dc_plane_address *address, > > + const bool force_disable_dcc) > > +{ > > + const uint64_t modifier = afb->base.modifier; > > + int ret; > > + > > + fill_gfx9_tiling_info_from_modifier(adev, tiling_info, modifier); > > + tiling_info->gfx9.swizzle = modifier_gfx9_swizzle_mode(modifier); > > + > > + if (modifier_has_dcc(modifier) && !force_disable_dcc) { > > + uint64_t dcc_address = afb->address + afb->base.offsets[1]; > > + > > + dcc->enable = 1; > > + dcc->meta_pitch = afb->base.pitches[1]; > > + dcc->independent_64b_blks = > > AMD_FMT_MOD_GET(DCC_INDEPENDENT_64B, modifier); > > + > > + address->grph.meta_addr.low_part = lower_32_bits(dcc_address); > > + address->grph.meta_addr.high_part = upper_32_bits(dcc_address); > > + } > > + > > + ret = validate_dcc(adev, format, rotation, tiling_info, dcc, address, > > plane_size); > > + if (ret) > > + return ret; > > + > > + return 0; > > +} > > + > > static int > > fill_plane_buffer_attributes(struct amdgpu_device *adev, > > const struct amdgpu_framebuffer *afb, > > @@ -3823,12 +3910,22 @@ fill_plane_buffer_attributes(struct amdgpu_device > > *adev, > > > > > > if (adev->family >= AMDGPU_FAMILY_AI) { > > - ret = fill_gfx9_plane_attributes_from_flags(adev, afb, format, > > rotation, > > - plane_size, > > tiling_info, dcc, > >
Re: [PATCH 3/7] drm/amd/display: Avoid using unvalidated tiling_flags and tmz_surface in prepare_planes
On Fri, Aug 07, 2020 at 10:29:09AM -0400, Kazlauskas, Nicholas wrote: > On 2020-08-07 4:30 a.m., dan...@ffwll.ch wrote: > > On Thu, Jul 30, 2020 at 04:36:38PM -0400, Nicholas Kazlauskas wrote: > > > [Why] > > > We're racing with userspace as the flags could potentially change > > > from when we acquired and validated them in commit_check. > > > > Uh ... I didn't know these could change. I think my comments on Bas' > > series are even more relevant now. I think long term would be best to bake > > these flags in at addfb time when modifiers aren't set. And otherwise > > always use the modifiers flag, and completely ignore the legacy flags > > here. > > -Daniel > > > > There's a set tiling/mod flags IOCTL that can be called after addfb happens, > so unless there's some sort of driver magic preventing this from working > when it's already been pinned for scanout then I don't see anything stopping > this from happening. > > I still need to review the modifiers series in a little more detail but that > looks like a good approach to fixing these kind of issues. Yeah we had the same model for i915, but it's awkward and can surprise compositors (since the client could change the tiling mode from underneath the compositor). So freezing the tiling mode at addfb time is the right thing to do, and anyway how things work with modifiers. Ofc maybe good to audit the -amd driver, but hopefully it doesn't do anything silly with changed tiling. If it does, it's kinda sad day. Btw for i915 we even went a step further, and made the set_tiling ioctl return an error if a framebuffer for that gem_bo existed. Just to make sure we don't ever accidentally break this. Cheers, Daniel > > Regards, > Nicholas Kazlauskas > > > > > > > [How] > > > We unfortunately can't drop this function in its entirety from > > > prepare_planes since we don't know the afb->address at commit_check > > > time yet. > > > > > > So instead of querying new tiling_flags and tmz_surface use the ones > > > from the plane_state directly. > > > > > > While we're at it, also update the force_disable_dcc option based > > > on the state from atomic check. > > > > > > Cc: Bhawanpreet Lakha > > > Cc: Rodrigo Siqueira > > > Signed-off-by: Nicholas Kazlauskas > > > --- > > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 ++- > > > 1 file changed, 19 insertions(+), 17 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > index bf1881bd492c..f78c09c9585e 100644 > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > > > @@ -5794,14 +5794,8 @@ static int dm_plane_helper_prepare_fb(struct > > > drm_plane *plane, > > > struct list_head list; > > > struct ttm_validate_buffer tv; > > > struct ww_acquire_ctx ticket; > > > - uint64_t tiling_flags; > > > uint32_t domain; > > > int r; > > > - bool tmz_surface = false; > > > - bool force_disable_dcc = false; > > > - > > > - dm_plane_state_old = to_dm_plane_state(plane->state); > > > - dm_plane_state_new = to_dm_plane_state(new_state); > > > if (!new_state->fb) { > > > DRM_DEBUG_DRIVER("No FB bound\n"); > > > @@ -5845,27 +5839,35 @@ static int dm_plane_helper_prepare_fb(struct > > > drm_plane *plane, > > > return r; > > > } > > > - amdgpu_bo_get_tiling_flags(rbo, _flags); > > > - > > > - tmz_surface = amdgpu_bo_encrypted(rbo); > > > - > > > ttm_eu_backoff_reservation(, ); > > > afb->address = amdgpu_bo_gpu_offset(rbo); > > > amdgpu_bo_ref(rbo); > > > + /** > > > + * We don't do surface updates on planes that have been newly created, > > > + * but we also don't have the afb->address during atomic check. > > > + * > > > + * Fill in buffer attributes depending on the address here, but only on > > > + * newly created planes since they're not being used by DC yet and this > > > + * won't modify global state. > > > + */ > > > + dm_plane_state_old = to_dm_plane_state(plane->state); > > > + dm_plane_state_new = to_dm_plane_state(new_state); > > > + > > > if (dm_plane_state_new->dc_state && > > > - dm_plane_state_old->dc_state != > > > dm_plane_state_new->dc_state) { > > > - struct dc_plane_state *plane_state = > > > dm_plane_state_new->dc_state; > > > + dm_plane_state_old->dc_state != dm_plane_state_new->dc_state) { > > > + struct dc_plane_state *plane_state = > > > + dm_plane_state_new->dc_state; > > > + bool force_disable_dcc = !plane_state->dcc.enable; > > > - force_disable_dcc = adev->asic_type == CHIP_RAVEN && > > > adev->in_suspend; > > > fill_plane_buffer_attributes( > > > adev, afb, plane_state->format, > > > plane_state->rotation, > > > -
RE: [PATCH 12/15] drm/amd/display: Use hw lock mgr
The condition in which it’s enabled is hidden from upstream. From: Brandon Wright Sent: Saturday, August 8, 2020 10:11 PM To: Brol, Eryk ; amd-gfx@lists.freedesktop.org; Wood, Wyatt Subject: Re: [PATCH 12/15] drm/amd/display: Use hw lock mgr Just curious, but I noticed this new lock manager isn't being used because of the definitive false return value in display/dc/dce/dmub_hw_lock_mgr.c:should_use_dmub_lock. Was this supposed to be enabled? From: Wyatt Wood mailto:wyatt.w...@amd.com>> [Why] Feature requires synchronization of dig, pipe, and cursor locking between driver and fw. [How] Set flag to force psr to use hw lock mgr. Signed-off-by: Wyatt Wood mailto:wyatt.w...@amd.com>> Reviewed-by: Anthony Koo mailto:anthony@amd.com>> Acked-by: Eryk Brol mailto:eryk.b...@amd.com>> --- drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c index 82e67bd81f2d..5167d6b8a48d 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c @@ -233,8 +233,8 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub, copy_settings_data->frame_cap_ind = psr_context->psrFrameCaptureIndicationReq; copy_settings_data->debug.bitfields.visual_confirm = dc->dc->debug.visual_confirm == VISUAL_CONFIRM_PSR ? true : false; + copy_settings_data->debug.bitfields.use_hw_lock_mgr = 1; copy_settings_data->init_sdp_deadline = psr_context->sdpTransmitLineNumDeadline; - copy_settings_data->debug.bitfields.use_hw_lock_mgr = 0; dc_dmub_srv_cmd_queue(dc->dmub_srv, ); dc_dmub_srv_cmd_execute(dc->dmub_srv); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: Non-deterministically boot into dark screen with `amdgpu`
Hi guys, Am 10.08.20 um 08:43 schrieb Alexander Monakov: Hi, you should Сс a specialized mailing list and a relevant maintainer, otherwise your email is likely to be ignored as LKML is an incredibly high-volume list. Adding amd-gfx and Alex Deucher. Thanks for forwarding this. AFAIK we haven't heard of this bug before, but Alex already might know more about it. More thoughts below. On Sun, 9 Aug 2020, Ignat Insarov wrote: Hello! This is an issue report. I am not familiar with the Linux kernel development procedure, so please direct me to a more appropriate or specialized medium if this is not the right avenue. My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than not. Screen blackness correlates with a line in the `systemd` journal that says `RAM width Nbits DDR4`, where N is either 128 (resulting in dark screen) or 64 (resulting in a healthy boot). The number seems to be chosen at random with bias towards 128. This has been going on for a while so here is some statistics: * 356 boots proceed far enough to attempt mode setting. * 82 boots set RAM width to 64 bits and presumably succeed. * 274 boots set RAM width to 128 bits and presumably fail. The issue is prevented with the `nomodeset` kernel option. I reported this previously (about a year ago) on the forum of my Linux distribution.[1] The issue still persists as of linux 5.8.0. The details of my graphics controller, as well as some journal excerpts, can be seen at [1]. One thing that has changed since then is that on failure, there now appears a null pointer dereference error. I am attaching the log of kernel messages from the most recent failed boot — please request more information if needed. I appreciate any directions and advice as to how I may go about fixing this annoyance. [1]: https://bbs.archlinux.org/viewtopic.php?id=248273 On the forum you show that in the "success" case there's one less "BIOS signature incorrect" message. This implies that amdgpu_get_bios() in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c gets the video BIOS from a different source. If that happens every time (one "signature incorrect" message for "success", two for "failure") that may be relevant to the problem you're experiencing. If you don't mind patching and rebuilding the kernel I suggest adding debug printks to the aforementioned function to see exactly which methods fail with wrong signature and which succeeds. Also might be worthwhile to check if there's a BIOS update for your laptop. It might also be a good idea to try the latest amd-staging-drm-next branch from Alex repository (bear with me I don't have the link at hand, but it should be easy to find). Opening a bug report or searching the existing ones for something similar under https://gitlab.freedesktop.org/drm/amd/-/issues might be a good idea as well. And I completely agree that this sounds like an issue getting the BIOS image. Thanks, Christian. Alexander ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: fix reload KMD hang on GFX10 KIQ
Am 10.08.20 um 05:59 schrieb Monk Liu: GFX10 KIQ will hang if we try below steps: modprobe amdgpu rmmod amdgpu modprobe amdgpu sched_hw_submission=4 Due to KIQ is always living there even after KMD unloaded thus when doing the realod KIQ will crash upon its register being programed by different values with the previous loading (the config like HQD addr, ring size, is easily changed if we alter the sched_hw_submission) the fix is we must inactive KIQ first before touching any of its registgers Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 622f442..0702c94 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -6435,6 +6435,10 @@ static int gfx_v10_0_kiq_init_register(struct amdgpu_ring *ring) struct v10_compute_mqd *mqd = ring->mqd_ptr; int j; + /* inactivate the queue */ + if (amdgpu_sriov_vf(adev)) Could you think of any reason why we shouldn't do this on bare metal as well? I mean it can't hurt to be extra careful even if the KIQ shouldn't be running. Christian. + WREG32_SOC15(GC, 0, mmCP_HQD_ACTIVE, 0); + /* disable wptr polling */ WREG32_FIELD15(GC, 0, CP_PQ_WPTR_POLL_CNTL, EN, 0); ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] dma-buf.rst: repair length of title underline
Am 09.08.20 um 08:17 schrieb Lukas Bulwahn: With commit 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are a bad idea"), document generation warns: Documentation/driver-api/dma-buf.rst:182: \ WARNING: Title underline too short. Repair length of title underline to remove warning. Fixes: 72b6ede73623 ("dma-buf.rst: Document why indefinite fences are a bad idea") Signed-off-by: Lukas Bulwahn Acked-by: Christian König Should I pick it up into drm-misc-next? --- Daniel, please pick this minor non-urgent fix to your new documentation. Documentation/driver-api/dma-buf.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 100bfd227265..13ea0cc0a3fa 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -179,7 +179,7 @@ DMA Fence uABI/Sync File :internal: Indefinite DMA Fences - +~ At various times _fence with an indefinite time until dma_fence_wait() finishes have been proposed. Examples include: ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
Am 10.08.20 um 12:50 schrieb Michel Dänzer: On 2020-08-09 2:13 p.m., Christian König wrote: Am 08.08.20 um 15:50 schrieb Jiaxun Yang: 在 2020/8/8 下午9:41, Thomas Bogendoerfer 写道: On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote: Loongson processors have a writecombine issue that maybe failed to write back framebuffer used with ATI Radeon or AMD GPU at times, after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine mapping for MIPS"), there exists some errors such as blurred screen and lockup, and so on. Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any influence on the other platforms. well it's not my call to take or reject this patch, but I already indicated it might be better to disable writecombine on the CPU detection side (or do you have other devices where writecombining works ?). Something like below will disbale it for all loongson64 CPUs. If you now find out where it works and where it doesn't, you can even reduce it to the required minium of affected CPUs. Hi Tiezhu, Thomas, Yes, writecombine works well on LS7A's internal GPU And even works well with some AMD GPUs (in my case, RX550). In this case the patch is a clear NAK since you haven't root caused the issue and are just working around it in a very questionable manner. To be fair though, amdgpu & radeon are already disabling write-combining for system memory pages in 32-bit x86 kernels for similar reasons. Yeah, well that is USWC for system memory. But this is about WC for the VRAM BAR. When we don't understand or don't correctly implement something on the platform for USWC then this is annoying, but not a serious issue. But when the hardware doesn't correctly implement WC for PCIe BARs, then this is a violation of the PCIe spec and a bit more serious issue for the whole platform. We can work around that by disabling WC for PCIe BARs on the whole platform, or behind specific bridges or or or, but patching each individual driver so that they work is not really the right approach. Cheers, Christian. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
On 2020-08-09 2:13 p.m., Christian König wrote: > Am 08.08.20 um 15:50 schrieb Jiaxun Yang: >> 在 2020/8/8 下午9:41, Thomas Bogendoerfer 写道: >>> On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote: Loongson processors have a writecombine issue that maybe failed to write back framebuffer used with ATI Radeon or AMD GPU at times, after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine mapping for MIPS"), there exists some errors such as blurred screen and lockup, and so on. Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any influence on the other platforms. >>> well it's not my call to take or reject this patch, but I already >>> indicated it might be better to disable writecombine on the CPU >>> detection side (or do you have other devices where writecombining >>> works ?). Something like below will disbale it for all loongson64 CPUs. >>> If you now find out where it works and where it doesn't, you can even >>> reduce it to the required minium of affected CPUs. >> Hi Tiezhu, Thomas, >> >> Yes, writecombine works well on LS7A's internal GPU >> And even works well with some AMD GPUs (in my case, RX550). > > In this case the patch is a clear NAK since you haven't root caused the > issue and are just working around it in a very questionable manner. To be fair though, amdgpu & radeon are already disabling write-combining for system memory pages in 32-bit x86 kernels for similar reasons. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm: amdgpu: Use the correct size when allocating memory
Am 09.08.20 um 22:34 schrieb Christophe JAILLET: When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead of 'sizeof(*sg)'. 'sg' (i.e. struct scatterlist) is smaller than 'sgt' (i.e struct sg_table), so this could lead to memory corruption. Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3") Signed-off-by: Christophe JAILLET Good catch, Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index 134cc36e30c5..0739e259bf91 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -462,7 +462,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev, unsigned int pages; int i, r; - *sgt = kmalloc(sizeof(*sg), GFP_KERNEL); + *sgt = kmalloc(sizeof(**sgt), GFP_KERNEL); if (!*sgt) return -ENOMEM; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: Non-deterministically boot into dark screen with `amdgpu`
Hi, you should Сс a specialized mailing list and a relevant maintainer, otherwise your email is likely to be ignored as LKML is an incredibly high-volume list. Adding amd-gfx and Alex Deucher. More thoughts below. On Sun, 9 Aug 2020, Ignat Insarov wrote: > Hello! > > This is an issue report. I am not familiar with the Linux kernel > development procedure, so please direct me to a more appropriate or > specialized medium if this is not the right avenue. > > My laptop (Ryzen 7 Pro CPU/GPU) boots into dark screen more often than > not. Screen blackness correlates with a line in the `systemd` journal > that says `RAM width Nbits DDR4`, where N is either 128 (resulting in > dark screen) or 64 (resulting in a healthy boot). The number seems to > be chosen at random with bias towards 128. This has been going on for > a while so here is some statistics: > > * 356 boots proceed far enough to attempt mode setting. > * 82 boots set RAM width to 64 bits and presumably succeed. > * 274 boots set RAM width to 128 bits and presumably fail. > > The issue is prevented with the `nomodeset` kernel option. > > I reported this previously (about a year ago) on the forum of my Linux > distribution.[1] The issue still persists as of linux 5.8.0. > > The details of my graphics controller, as well as some journal > excerpts, can be seen at [1]. One thing that has changed since then is > that on failure, there now appears a null pointer dereference error. I > am attaching the log of kernel messages from the most recent failed > boot — please request more information if needed. > > I appreciate any directions and advice as to how I may go about fixing > this annoyance. > > [1]: https://bbs.archlinux.org/viewtopic.php?id=248273 On the forum you show that in the "success" case there's one less "BIOS signature incorrect" message. This implies that amdgpu_get_bios() in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c gets the video BIOS from a different source. If that happens every time (one "signature incorrect" message for "success", two for "failure") that may be relevant to the problem you're experiencing. If you don't mind patching and rebuilding the kernel I suggest adding debug printks to the aforementioned function to see exactly which methods fail with wrong signature and which succeeds. Also might be worthwhile to check if there's a BIOS update for your laptop. Alexander___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64
On 08/09/2020 08:13 PM, Christian König wrote: Am 08.08.20 um 15:50 schrieb Jiaxun Yang: 在 2020/8/8 下午9:41, Thomas Bogendoerfer 写道: On Sat, Aug 08, 2020 at 03:25:02PM +0800, Tiezhu Yang wrote: Loongson processors have a writecombine issue that maybe failed to write back framebuffer used with ATI Radeon or AMD GPU at times, after commit 8a08e50cee66 ("drm: Permit video-buffers writecombine mapping for MIPS"), there exists some errors such as blurred screen and lockup, and so on. Remove the flag TTM_PL_FLAG_WC of VRAM to fix writecombine issue for Loongson64 to work well with ATI Radeon or AMD GPU, and it has no any influence on the other platforms. well it's not my call to take or reject this patch, but I already indicated it might be better to disable writecombine on the CPU detection side (or do you have other devices where writecombining works ?). Something like below will disbale it for all loongson64 CPUs. If you now find out where it works and where it doesn't, you can even reduce it to the required minium of affected CPUs. Hi Tiezhu, Thomas, Yes, writecombine works well on LS7A's internal GPU And even works well with some AMD GPUs (in my case, RX550). In this case the patch is a clear NAK since you haven't root caused the issue and are just working around it in a very questionable manner. Tiezhu, is it possible to investigate the issue deeper in Loongson? Probably we just need to add some barrier to maintain the data coherency, or disable writecombine for AMD GPU's command buffer and leave texture/frame buffer wc accelerated. Have you moved any buffer to VRAM and forgot to add an HDP flush/invalidate? The acceleration is not much of a problem, but if WC doesn't work in general you need to disable it for the whole CPU and not for individual drivers. Hi Thomas, Jiaxun and Christian, Thank you very much for your suggestions. Actually, this patch is a temporary solution to just make it work well, it is not a proper and final solution. I understand your opinions, it will take some time to find the root cause. Thanks, Tiezhu Regards, Christian. Thanks. - Jiaxun ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 2/2] drm/amdgpu: add debugfs node to toggle ras error cnt harvest
[AMD Public Use] Series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Chen, Guchun Sent: Monday, August 10, 2020 13:23 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Li, Dennis ; Lazar, Lijo ; Zhou1, Tao ; Clements, John Cc: Chen, Guchun Subject: [PATCH 2/2] drm/amdgpu: add debugfs node to toggle ras error cnt harvest Before ras recovery is issued, user could operate this debugfs node to enable/disable the harvest of all RAS IPs' ras error count registers, which will help keep hardware's registers' status instead of cleaning up them. Signed-off-by: Guchun Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index e6978b8e2143..31df6bf2dc1f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1215,6 +1215,13 @@ static void amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *adev) */ debugfs_create_bool("auto_reboot", S_IWUGO | S_IRUGO, con->dir, >reboot); + + /* +* User could set this not to clean up hardware's error count register +* of RAS IPs during ras recovery. +*/ + debugfs_create_bool("disable_ras_err_cnt_harvest", 0644, + con->dir, >disable_ras_err_cnt_harvest); } void amdgpu_ras_debugfs_create(struct amdgpu_device *adev, -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx