Re: [RFC] a new approach to detect which ring is the real black sheep upon TDR reported
Hi Monk, in general an interesting idea, but I see two major problems with that: 1. It would make the reset take much longer. 2. Things get often stuck because of timing issues, so a guilty job might pass perfectly when run a second time. Apart from that the whole ring mirror list turned out to be a really bad idea. E.g. we still struggle with object life time because the concept doesn't fit into the object model of the GPU scheduler under Linux. We should probably work on this separately and straighten up the job destruction once more and keep the recovery information in the fence instead. Regards, Christian. Am 26.02.21 um 06:58 schrieb Liu, Monk: [AMD Public Use] Hi all NAVI2X project hit a really hard to solve issue now, and it is turned out to be a general headache of our TDR mechanism , check below scenario: 1. There is a job1 running on compute1 ring at timestamp 2. There is a job2 running on gfx ring at timestamp 3. Job1 is the guilty one, and job1/job2 were scheduled to their rings at almost the same timestamp 4. After 2 seconds we receive two TDR reporting from both GFX ring and compute ring 5. *Current scheme is that in drm scheduler all the head jobs of those two rings are considered “bad job” and taken away from the mirror list * 6. The result is both the real guilty job (job1) and the innocent job (job2) were all deleted from mirror list, and their corresponding contexts were also treated as guilty*(so the innocent process remains running is not secured)* ** But by our wish the ideal case is TDR mechanism can detect which ring is the guilty ring and the innocent ring can resubmits all its pending jobs: 1. Job1 to be deleted from compute1 ring’s mirror list 2. Job2 is kept and resubmitted later and its belonging process/context are even not aware of this TDR at all Here I have a proposal tend to achieve above goal and it rough procedure is : 1. Once any ring reports a TDR, the head job is **not** treated as “bad job”, and it is **not** deleted from the mirror list in drm sched functions 2. In vendor’s function (our amdgpu driver here): * reset GPU * repeat below actions on each RINGS * one by one *: 1.take the head job and submit it on this ring 2.see if it completes, if not then this job is the real “bad job” 3. take it away from mirror list if this head job is “bad job” * After above iteration on all RINGS, we already clears all the bad job(s) 2. Resubmit all jobs from each mirror list to their corresponding rings (this is the existed logic) The idea of this is to use “serial” way to re-run and re-check each head job of each RING, in order to take out the real black sheep and its guilty context. P.S.: we can use this approaches only on GFX/KCQ ring reports TDR , since those rings are intermutually affected to each other. For SDMA ring timeout it definitely proves the head job on SDMA ring is really guilty. Thanks -- Monk Liu | Cloud-GPU Core team -- ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 0/3] drm/ttm: constify static vm_operations_structs
Am 23.02.21 um 18:31 schrieb Alex Deucher: On Wed, Feb 10, 2021 at 8:14 AM Daniel Vetter wrote: On Wed, Feb 10, 2021 at 08:45:56AM +0100, Christian König wrote: Reviewed-by: Christian König for the series. Smash it into -misc? @Christian Koenig did these ever land? I don't see them in drm-misc. I've just pushed them to drm-misc-next. Sorry for the delay, totally forgot about them. Christian. Alex -Daniel Am 10.02.21 um 00:48 schrieb Rikard Falkeborn: Constify a few static vm_operations_struct that are never modified. Their only usage is to assign their address to the vm_ops field in the vm_area_struct, which is a pointer to const vm_operations_struct. Make them const to allow the compiler to put them in read-only memory. With this series applied, all static struct vm_operations_struct in the kernel tree are const. Rikard Falkeborn (3): drm/amdgpu/ttm: constify static vm_operations_struct drm/radeon/ttm: constify static vm_operations_struct drm/nouveau/ttm: constify static vm_operations_struct drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +- drivers/gpu/drm/nouveau/nouveau_ttm.c | 2 +- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) -- Daniel Vetter Software Engineer, Intel Corporation https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2F&data=04%7C01%7Cchristian.koenig%40amd.com%7C9d730e56efe54d3215ee08d8d820d642%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637496982837619645%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=b4UU8bzeX%2Ba1VfObe8mta7fwtjVv%2F1wo4%2FPVuGZFW8Q%3D&reserved=0 ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=04%7C01%7Cchristian.koenig%40amd.com%7C9d730e56efe54d3215ee08d8d820d642%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637496982837629638%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=RKJh6p%2BTxaD0lH6M%2B0s3nah3tBatRFqoTvy3Mh7Lz5M%3D&reserved=0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] radeon: ERROR: space prohibited before that ','
Well coding style clean ups are usually welcome, but not necessarily one by one. We can probably merge this if you clean up all checkpatch.pl warnings in the whole file. Christian. Am 26.02.21 um 07:05 schrieb wangjingyu: drm_property_create_range(rdev->ddev, 0 , "coherent", 0, 1); Signed-off-by: wangjingyu --- drivers/gpu/drm/radeon/radeon_display.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c index 3a6fedad002d..439d1b3e87d8 100644 --- a/drivers/gpu/drm/radeon/radeon_display.c +++ b/drivers/gpu/drm/radeon/radeon_display.c @@ -1396,7 +1396,7 @@ static int radeon_modeset_create_props(struct radeon_device *rdev) if (rdev->is_atom_bios) { rdev->mode_info.coherent_mode_property = - drm_property_create_range(rdev->ddev, 0 , "coherent", 0, 1); + drm_property_create_range(rdev->ddev, 0, "coherent", 0, 1); if (!rdev->mode_info.coherent_mode_property) return -ENOMEM; } ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/pm: optimize the link width/speed retrieving V2
By using the information provided by PMFW when available. V2: put those structures shared around SMU V11 ASICs in smu_v11_0.h Change-Id: I1afec4cd07ac9608861ee07c449e320e3f36a932 Signed-off-by: Evan Quan Acked-by: Alex Deucher --- drivers/gpu/drm/amd/pm/inc/smu_v11_0.h| 10 -- .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 12 --- .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 20 +++ .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 10 ++ 4 files changed, 30 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h index 50dd1529b994..d400f75e9202 100644 --- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h +++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h @@ -58,6 +58,12 @@ #define CTF_OFFSET_HOTSPOT 5 #define CTF_OFFSET_MEM 5 +#define LINK_WIDTH_MAX 6 +#define LINK_SPEED_MAX 3 + +static __maybe_unused uint8_t link_width[] = {0, 1, 2, 4, 8, 12, 16}; +static __maybe_unused uint8_t link_speed[] = {25, 50, 80, 160}; + static const struct smu_temperature_range __maybe_unused smu11_thermal_policy[] = { @@ -284,11 +290,11 @@ int smu_v11_0_get_dpm_level_range(struct smu_context *smu, int smu_v11_0_get_current_pcie_link_width_level(struct smu_context *smu); -int smu_v11_0_get_current_pcie_link_width(struct smu_context *smu); +uint8_t smu_v11_0_get_current_pcie_link_width(struct smu_context *smu); int smu_v11_0_get_current_pcie_link_speed_level(struct smu_context *smu); -int smu_v11_0_get_current_pcie_link_speed(struct smu_context *smu); +uint8_t smu_v11_0_get_current_pcie_link_speed(struct smu_context *smu); int smu_v11_0_gfx_ulv_control(struct smu_context *smu, bool enablement); diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index ffd37b8a3882..f71723c345a8 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -2716,10 +2716,8 @@ static ssize_t navi10_get_gpu_metrics(struct smu_context *smu, gpu_metrics->current_fan_speed = metrics.CurrFanSpeed; - gpu_metrics->pcie_link_width = - smu_v11_0_get_current_pcie_link_width(smu); - gpu_metrics->pcie_link_speed = - smu_v11_0_get_current_pcie_link_speed(smu); + gpu_metrics->pcie_link_width = metrics.PcieWidth; + gpu_metrics->pcie_link_speed = link_speed[metrics.PcieRate]; gpu_metrics->system_clock_counter = ktime_get_boottime_ns(); @@ -2856,10 +2854,8 @@ static ssize_t navi12_get_gpu_metrics(struct smu_context *smu, gpu_metrics->current_fan_speed = metrics.CurrFanSpeed; - gpu_metrics->pcie_link_width = - smu_v11_0_get_current_pcie_link_width(smu); - gpu_metrics->pcie_link_speed = - smu_v11_0_get_current_pcie_link_speed(smu); + gpu_metrics->pcie_link_width = metrics.PcieWidth; + gpu_metrics->pcie_link_speed = link_speed[metrics.PcieRate]; gpu_metrics->system_clock_counter = ktime_get_boottime_ns(); diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index e74299da1739..527e02b578af 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -3124,6 +3124,8 @@ static ssize_t sienna_cichlid_get_gpu_metrics(struct smu_context *smu, SmuMetricsExternal_t metrics_external; SmuMetrics_t *metrics = &(metrics_external.SmuMetrics); + struct amdgpu_device *adev = smu->adev; + uint32_t smu_version; int ret = 0; ret = smu_cmn_get_metrics_table(smu, @@ -3170,10 +3172,20 @@ static ssize_t sienna_cichlid_get_gpu_metrics(struct smu_context *smu, gpu_metrics->current_fan_speed = metrics->CurrFanSpeed; - gpu_metrics->pcie_link_width = - smu_v11_0_get_current_pcie_link_width(smu); - gpu_metrics->pcie_link_speed = - smu_v11_0_get_current_pcie_link_speed(smu); + ret = smu_cmn_get_smc_version(smu, NULL, &smu_version); + if (ret) + return ret; + + if (((adev->asic_type == CHIP_SIENNA_CICHLID) && smu_version > 0x003A1E00) || + ((adev->asic_type == CHIP_NAVY_FLOUNDER) && smu_version > 0x00410400)) { + gpu_metrics->pcie_link_width = metrics->PcieWidth; + gpu_metrics->pcie_link_speed = link_speed[metrics->PcieRate]; + } else { + gpu_metrics->pcie_link_width = + smu_v11_0_get_current_pcie_link_width(smu); + gpu_metrics->pcie_link_speed = + smu_v11_0_get_current_pcie_link_speed(smu); + } gpu_metrics->system_clock_cou
[PATCH] drm/amd/pm: correct gpu metrics related data structures V2
To make sure they are naturally aligned. V2: minimum the possible influence to existing applications which were developed based on those data structures. With this change, only 32bit OS are affected while 64bit OS not. Change-Id: I0a139e1e1f09fe27deffdce1cec6ea4594947625 Signed-off-by: Evan Quan --- drivers/gpu/drm/amd/include/kgd_pp_interface.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h b/drivers/gpu/drm/amd/include/kgd_pp_interface.h index 41c89f7d6412..ca38a204beb0 100644 --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h @@ -339,6 +339,8 @@ struct metrics_table_header { uint16_tstructure_size; uint8_t format_revision; uint8_t content_revision; + /* make the data structure naturely aligned for 64bit OS */ + uint16_tpadding[2]; }; struct gpu_metrics_v1_0 { -- 2.29.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH v2] drm/amdgpu: remove unnecessary reading for epprom header
[AMD Public Use] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Dennis Li Sent: Friday, February 26, 2021 14:42 To: amd-gfx@lists.freedesktop.org; Chen, Guchun ; Zhang, Hawking ; Koenig, Christian Cc: Li, Dennis Subject: [PATCH v2] drm/amdgpu: remove unnecessary reading for epprom header If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly. v2: merge amdgpu_ras_check_err_threshold into amdgpu_ras_eeprom_check_err_threshold Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f0f7ed42ee7f..f2ff10403d93 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4397,7 +4397,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, * bad_page_threshold value to fix this once * probing driver again. */ - if (!amdgpu_ras_check_err_threshold(tmp_adev)) { + if (!amdgpu_ras_eeprom_check_err_threshold(tmp_adev)) { /* must succeed. */ amdgpu_ras_resume(tmp_adev); } else { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 09546dec40ff..c669435ccc74 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -2189,19 +2189,3 @@ bool amdgpu_ras_need_emergency_restart(struct amdgpu_device *adev) return false; } - -bool amdgpu_ras_check_err_threshold(struct amdgpu_device *adev) -{ - struct amdgpu_ras *con = amdgpu_ras_get_context(adev); - bool exc_err_limit = false; - - if (con && (amdgpu_bad_page_threshold != 0)) - amdgpu_ras_eeprom_check_err_threshold(&con->eeprom_control, - &exc_err_limit); - - /* -* We are only interested in variable exc_err_limit, -* as it says if GPU is in bad state or not. -*/ - return exc_err_limit; -} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index aed0716efa5a..42aab9adc263 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -491,8 +491,6 @@ void amdgpu_ras_suspend(struct amdgpu_device *adev); unsigned long amdgpu_ras_query_error_count(struct amdgpu_device *adev, bool is_ce); -bool amdgpu_ras_check_err_threshold(struct amdgpu_device *adev); - /* error handling functions */ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev, struct eeprom_table_record *bps, int pages); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 19d9aa76cfbf..7f527f8bbdb1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -434,47 +434,21 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address) return curr_address; } -int amdgpu_ras_eeprom_check_err_threshold( - struct amdgpu_ras_eeprom_control *control, - bool *exceed_err_limit) +bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev) { - struct amdgpu_device *adev = to_amdgpu_device(control); - unsigned char buff[EEPROM_ADDRESS_SIZE + - EEPROM_TABLE_HEADER_SIZE] = { 0 }; - struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; - struct i2c_msg msg = { - .addr = control->i2c_address, - .flags = I2C_M_RD, - .len = EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE, - .buf = buff, - }; - int ret; - - *exceed_err_limit = false; + struct amdgpu_ras *con = amdgpu_ras_get_context(adev); if (!__is_ras_eeprom_supported(adev)) - return 0; - - /* read EEPROM table header */ - mutex_lock(&control->tbl_mutex); - ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1); - if (ret < 1) { - dev_err(adev->dev, "Failed to read EEPROM table header.\n"); - goto err; - } - - __decode_table_header_from_buff(hdr, &buff[2]); + return false; - if (hdr->header == EEPROM_TABLE_HDR_BAD) { + if (con->eeprom_control.tbl_hdr.header == EEPROM_TABLE_HDR_BAD) { dev_warn(adev->dev, "This GPU is in BAD status."); dev_warn(adev->dev, "Please retire it or setting one bigger " "threshold val
[PATCH v2] drm/amdgpu: remove unnecessary reading for epprom header
If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly. v2: merge amdgpu_ras_check_err_threshold into amdgpu_ras_eeprom_check_err_threshold Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f0f7ed42ee7f..f2ff10403d93 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4397,7 +4397,7 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, * bad_page_threshold value to fix this once * probing driver again. */ - if (!amdgpu_ras_check_err_threshold(tmp_adev)) { + if (!amdgpu_ras_eeprom_check_err_threshold(tmp_adev)) { /* must succeed. */ amdgpu_ras_resume(tmp_adev); } else { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 09546dec40ff..c669435ccc74 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -2189,19 +2189,3 @@ bool amdgpu_ras_need_emergency_restart(struct amdgpu_device *adev) return false; } - -bool amdgpu_ras_check_err_threshold(struct amdgpu_device *adev) -{ - struct amdgpu_ras *con = amdgpu_ras_get_context(adev); - bool exc_err_limit = false; - - if (con && (amdgpu_bad_page_threshold != 0)) - amdgpu_ras_eeprom_check_err_threshold(&con->eeprom_control, - &exc_err_limit); - - /* -* We are only interested in variable exc_err_limit, -* as it says if GPU is in bad state or not. -*/ - return exc_err_limit; -} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index aed0716efa5a..42aab9adc263 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -491,8 +491,6 @@ void amdgpu_ras_suspend(struct amdgpu_device *adev); unsigned long amdgpu_ras_query_error_count(struct amdgpu_device *adev, bool is_ce); -bool amdgpu_ras_check_err_threshold(struct amdgpu_device *adev); - /* error handling functions */ int amdgpu_ras_add_bad_pages(struct amdgpu_device *adev, struct eeprom_table_record *bps, int pages); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 19d9aa76cfbf..7f527f8bbdb1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -434,47 +434,21 @@ static uint32_t __correct_eeprom_dest_address(uint32_t curr_address) return curr_address; } -int amdgpu_ras_eeprom_check_err_threshold( - struct amdgpu_ras_eeprom_control *control, - bool *exceed_err_limit) +bool amdgpu_ras_eeprom_check_err_threshold(struct amdgpu_device *adev) { - struct amdgpu_device *adev = to_amdgpu_device(control); - unsigned char buff[EEPROM_ADDRESS_SIZE + - EEPROM_TABLE_HEADER_SIZE] = { 0 }; - struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; - struct i2c_msg msg = { - .addr = control->i2c_address, - .flags = I2C_M_RD, - .len = EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE, - .buf = buff, - }; - int ret; - - *exceed_err_limit = false; + struct amdgpu_ras *con = amdgpu_ras_get_context(adev); if (!__is_ras_eeprom_supported(adev)) - return 0; - - /* read EEPROM table header */ - mutex_lock(&control->tbl_mutex); - ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1); - if (ret < 1) { - dev_err(adev->dev, "Failed to read EEPROM table header.\n"); - goto err; - } - - __decode_table_header_from_buff(hdr, &buff[2]); + return false; - if (hdr->header == EEPROM_TABLE_HDR_BAD) { + if (con->eeprom_control.tbl_hdr.header == EEPROM_TABLE_HDR_BAD) { dev_warn(adev->dev, "This GPU is in BAD status."); dev_warn(adev->dev, "Please retire it or setting one bigger " "threshold value when reloading driver.\n"); - *exceed_err_limit = true; + return true; } -err: - mutex_unlock(&control->tbl_mutex); - return 0; + return false; } int amdgpu_ras_eeprom_process_recods(struct amdgpu_ras_eeprom_control *control, diff --git a/drivers/gpu/drm/amd/amdgpu/amd
[PATCH] drm/amd/pm: bump Navi1x driver if version and related data structures V2
New changes were involved for the SmuMetrics structure. Change-Id: Ib45443db03977ccd18618bcfdfd3574ac13d50d1 Signed-off-by: Evan Quan --- .../drm/amd/pm/inc/smu11_driver_if_navi10.h | 98 ++- drivers/gpu/drm/amd/pm/inc/smu_v11_0.h| 6 +- .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 609 +- 3 files changed, 673 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/inc/smu11_driver_if_navi10.h b/drivers/gpu/drm/amd/pm/inc/smu11_driver_if_navi10.h index 246d3951a78a..04752ade1016 100644 --- a/drivers/gpu/drm/amd/pm/inc/smu11_driver_if_navi10.h +++ b/drivers/gpu/drm/amd/pm/inc/smu11_driver_if_navi10.h @@ -843,11 +843,15 @@ typedef struct { uint16_t FanMaximumRpm; uint16_t FanMinimumPwm; uint16_t FanTargetTemperature; // Degree Celcius + uint16_t FanMode; + uint16_t FanMaxPwm; + uint16_t FanMinPwm; + uint16_t FanMaxTemp; // Degree Celcius + uint16_t FanMinTemp; // Degree Celcius uint16_t MaxOpTemp;// Degree Celcius uint16_t FanZeroRpmEnable; - uint16_t Padding; - uint32_t MmHubPadding[8]; // SMU internal use + uint32_t MmHubPadding[6]; // SMU internal use } OverDriveTable_t; @@ -880,6 +884,45 @@ typedef struct { uint8_t Padding8_2; uint16_t CurrFanSpeed; + // Padding - ignore + uint32_t MmHubPadding[8]; // SMU internal use +} SmuMetrics_legacy_t; + +typedef struct { + uint16_t CurrClock[PPCLK_COUNT]; + uint16_t AverageGfxclkFrequencyPostDs; + uint16_t AverageSocclkFrequency; + uint16_t AverageUclkFrequencyPostDs; + uint16_t AverageGfxActivity; + uint16_t AverageUclkActivity ; + uint8_t CurrSocVoltageOffset ; + uint8_t CurrGfxVoltageOffset ; + uint8_t CurrMemVidOffset ; + uint8_t Padding8 ; + uint16_t AverageSocketPower; + uint16_t TemperatureEdge ; + uint16_t TemperatureHotspot; + uint16_t TemperatureMem; + uint16_t TemperatureVrGfx ; + uint16_t TemperatureVrMem0 ; + uint16_t TemperatureVrMem1 ; + uint16_t TemperatureVrSoc ; + uint16_t TemperatureLiquid0; + uint16_t TemperatureLiquid1; + uint16_t TemperaturePlx; + uint16_t Padding16 ; + uint32_t ThrottlerStatus ; + + uint8_t LinkDpmLevel; + uint8_t Padding8_2; + uint16_t CurrFanSpeed; + + uint16_t AverageGfxclkFrequencyPreDs; + uint16_t AverageUclkFrequencyPreDs; + uint8_t PcieRate; + uint8_t PcieWidth; + uint8_t Padding8_3[2]; + // Padding - ignore uint32_t MmHubPadding[8]; // SMU internal use } SmuMetrics_t; @@ -919,10 +962,61 @@ typedef struct { uint16_t VcnActivityPercentage ; uint16_t padding16_2; + // Padding - ignore + uint32_t MmHubPadding[8]; // SMU internal use +} SmuMetrics_NV12_legacy_t; + +typedef struct { + uint16_t CurrClock[PPCLK_COUNT]; + uint16_t AverageGfxclkFrequencyPostDs; + uint16_t AverageSocclkFrequency; + uint16_t AverageUclkFrequencyPostDs; + uint16_t AverageGfxActivity; + uint16_t AverageUclkActivity ; + uint8_t CurrSocVoltageOffset ; + uint8_t CurrGfxVoltageOffset ; + uint8_t CurrMemVidOffset ; + uint8_t Padding8 ; + uint16_t AverageSocketPower; + uint16_t TemperatureEdge ; + uint16_t TemperatureHotspot; + uint16_t TemperatureMem; + uint16_t TemperatureVrGfx ; + uint16_t TemperatureVrMem0 ; + uint16_t TemperatureVrMem1 ; + uint16_t TemperatureVrSoc ; + uint16_t TemperatureLiquid0; + uint16_t TemperatureLiquid1; + uint16_t TemperaturePlx; + uint16_t Padding16 ; + uint32_t ThrottlerStatus ; + + uint8_t LinkDpmLevel; + uint8_t Padding8_2; + uint16_t CurrFanSpeed; + + uint16_t AverageVclkFrequency ; + uint16_t AverageDclkFrequency ; + uint16_t VcnActivityPercentage ; + uint16_t AverageGfxclkFrequencyPreDs; + uint16_t AverageUclkFrequencyPreDs; + uint8_t PcieRate; + uint8_t PcieWidth; + + uint32_t Padding32_1; + uint64_t EnergyAccumulator; + // Padding - ignore uint32_t MmHubPadding[8]; // SMU internal use } SmuMetrics_NV12_t; +typedef union SmuMetrics { + SmuMetrics_legacy_t nv10_legacy_metrics; + SmuMetrics_tnv10_metrics; + SmuMetrics_NV12_legacy_tnv12_legacy_metrics; + SmuMetrics_NV12_t nv12_metrics; +} SmuMetrics_NV1X_t; + typedef struct { uint16_t MinClock; // This is either DCEFCLK or SOCCLK (in MHz) uint16_t MaxClock; // This is either DCEFCLK or SOCCLK (in MHz) diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h index 281835f23f6d..50dd1529b994 100644 --- a/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h +++ b/drivers/gpu/drm/amd/pm/inc/smu_v11_0.h @@ -27,9 +27,9 @@ #define SMU11_DRIVER_IF_VERSION_INV 0x #define SMU11_DRIVER_IF_VERSION_ARCT 0x17 -#define SMU11_DRIVER_IF_VERSION_NV10 0x36 -#define SMU11_DRIVE
[RFC] a new approach to detect which ring is the real black sheep upon TDR reported
[AMD Public Use] Hi all NAVI2X project hit a really hard to solve issue now, and it is turned out to be a general headache of our TDR mechanism , check below scenario: 1. There is a job1 running on compute1 ring at timestamp 2. There is a job2 running on gfx ring at timestamp 3. Job1 is the guilty one, and job1/job2 were scheduled to their rings at almost the same timestamp 4. After 2 seconds we receive two TDR reporting from both GFX ring and compute ring 5. Current scheme is that in drm scheduler all the head jobs of those two rings are considered "bad job" and taken away from the mirror list 6. The result is both the real guilty job (job1) and the innocent job (job2) were all deleted from mirror list, and their corresponding contexts were also treated as guilty (so the innocent process remains running is not secured) But by our wish the ideal case is TDR mechanism can detect which ring is the guilty ring and the innocent ring can resubmits all its pending jobs: 1. Job1 to be deleted from compute1 ring's mirror list 2. Job2 is kept and resubmitted later and its belonging process/context are even not aware of this TDR at all Here I have a proposal tend to achieve above goal and it rough procedure is : 1. Once any ring reports a TDR, the head job is *not* treated as "bad job", and it is *not* deleted from the mirror list in drm sched functions 2. In vendor's function (our amdgpu driver here): * reset GPU * repeat below actions on each RINGS * one by one *: 1. take the head job and submit it on this ring 2. see if it completes, if not then this job is the real "bad job" 3. take it away from mirror list if this head job is "bad job" * After above iteration on all RINGS, we already clears all the bad job(s) 1. Resubmit all jobs from each mirror list to their corresponding rings (this is the existed logic) The idea of this is to use "serial" way to re-run and re-check each head job of each RING, in order to take out the real black sheep and its guilty context. P.S.: we can use this approaches only on GFX/KCQ ring reports TDR , since those rings are intermutually affected to each other. For SDMA ring timeout it definitely proves the head job on SDMA ring is really guilty. Thanks -- Monk Liu | Cloud-GPU Core team -- ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] amdgpu/pm: read_sensor() report failure apporpriately
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Evan Quan -Original Message- From: amd-gfx On Behalf Of Shirish S Sent: Thursday, February 25, 2021 11:45 PM To: Deucher, Alexander ; amd-gfx@lists.freedesktop.org Cc: S, Shirish Subject: [PATCH] amdgpu/pm: read_sensor() report failure apporpriately report -ENOTSUPP instead of -EINVAL, so that if userspace fails to read sensor data can figure it out the failure correctly. Signed-off-by: Shirish S --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/si_dpm.c | 2 +- 8 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c index 2c90f715ee0a..c932b632ddd4 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c @@ -1285,7 +1285,7 @@ static int smu10_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 4; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index c57dc9ae81f2..a58f92249c53 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -3945,7 +3945,7 @@ static int smu7_read_sensor(struct pp_hwmgr *hwmgr, int idx, *((uint32_t *)value) = (uint32_t)convert_to_vddc(val_vid); return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c index ed9b89980184..2cef9c0c6d6f 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c @@ -1805,7 +1805,7 @@ static int smu8_read_sensor(struct pp_hwmgr *hwmgr, int idx, *((uint32_t *)value) = smu8_thermal_get_temperature(hwmgr); return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c index 29c99642d22d..5e875ad8d633 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c @@ -3890,7 +3890,7 @@ static int vega10_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 8; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c index c0753029a8e2..a827f2bc7904 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c @@ -1429,7 +1429,7 @@ static int vega12_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 8; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } return ret; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c index 87811b005b85..e8eec2539c17 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c @@ -2240,7 +2240,7 @@ static int vega20_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 8; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } return ret; diff --git a/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c b/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c index 66daabebee35..bcae42cef374 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c +++ b/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c @@ -3305,7 +3305,7 @@ static int kv_dpm_read_sensor(void *handle, int idx, *size = 4; return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } diff --git a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c index 62291358fb1c..26a5321e621b 100644 --- a/drivers/gpu/drm/amd/pm/po
RE: [PATCH] drm/amdgpu: enable one vf mode on navi21 vf
[AMD Public Use] Acked-by: Evan Quan -Original Message- From: Horace Chen Sent: Thursday, February 25, 2021 8:05 PM To: amd-gfx@lists.freedesktop.org Cc: Grodzovsky, Andrey ; Quan, Evan ; Chen, Horace ; Tuikov, Luben ; Koenig, Christian ; Deucher, Alexander ; Xiao, Jack ; Zhang, Hawking ; Liu, Monk ; Xu, Feifei ; Wang, Kevin(Yang) ; Xiaojie Yuan Subject: [PATCH] drm/amdgpu: enable one vf mode on navi21 vf navi21 vf needs one vf mode which allows vf to set and get clock status from guest vm. So now expose the required interface and allow some smu request on VF mode. Also since navi21 blocked direct MMIO access, use KIQ to send SMU request under sriov vf. OD use same command as getting pp table which is not allowed for navi21, so remove OD feature under sriov vf. Signed-off-by: Horace Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ drivers/gpu/drm/amd/pm/amdgpu_pm.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c| 10 ++ .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 10 +- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 12 ++-- 5 files changed, 20 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f0f7ed42ee7f..dfbf2f2db0de 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2043,6 +2043,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) adev->pm.pp_feature = amdgpu_pp_feature_mask; if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS) adev->pm.pp_feature &= ~PP_GFXOFF_MASK; + if (amdgpu_sriov_vf(adev) && adev->asic_type == CHIP_SIENNA_CICHLID) + adev->pm.pp_feature &= ~PP_OVERDRIVE_MASK; for (i = 0; i < adev->num_ip_blocks; i++) { if ((amdgpu_ip_block_mask & (1 << i)) == 0) { diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index b770dd634ab6..1866cbaf70c3 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -2167,7 +2167,7 @@ static ssize_t amdgpu_get_gpu_metrics(struct device *dev, static struct amdgpu_device_attr amdgpu_device_attrs[] = { AMDGPU_DEVICE_ATTR_RW(power_dpm_state, ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF), - AMDGPU_DEVICE_ATTR_RW(power_dpm_force_performance_level, ATTR_FLAG_BASIC), + AMDGPU_DEVICE_ATTR_RW(power_dpm_force_performance_level, ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF), AMDGPU_DEVICE_ATTR_RO(pp_num_states, ATTR_FLAG_BASIC), AMDGPU_DEVICE_ATTR_RO(pp_cur_state, ATTR_FLAG_BASIC), AMDGPU_DEVICE_ATTR_RW(pp_force_state, ATTR_FLAG_BASIC), diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index d143ef1b460b..7033d52eb4d0 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -612,10 +612,12 @@ static int smu_late_init(void *handle) return ret; } - ret = smu_set_default_od_settings(smu); - if (ret) { - dev_err(adev->dev, "Failed to setup default OD settings!\n"); - return ret; + if (smu->od_enabled) { + ret = smu_set_default_od_settings(smu); + if (ret) { + dev_err(adev->dev, "Failed to setup default OD settings!\n"); + return ret; + } } ret = smu_populate_umd_state_clk(smu); diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index af73e1430af5..441effc23625 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -89,17 +89,17 @@ static struct cmn2asic_msg_mapping sienna_cichlid_message_map[SMU_MSG_MAX_COUNT] MSG_MAP(GetEnabledSmuFeaturesHigh, PPSMC_MSG_GetRunningSmuFeaturesHigh, 1), MSG_MAP(SetWorkloadMask,PPSMC_MSG_SetWorkloadMask, 1), MSG_MAP(SetPptLimit,PPSMC_MSG_SetPptLimit, 0), - MSG_MAP(SetDriverDramAddrHigh, PPSMC_MSG_SetDriverDramAddrHigh, 0), - MSG_MAP(SetDriverDramAddrLow, PPSMC_MSG_SetDriverDramAddrLow, 0), + MSG_MAP(SetDriverDramAddrHigh, PPSMC_MSG_SetDriverDramAddrHigh, 1), + MSG_MAP(SetDriverDramAddrLow, PPSMC_MSG_SetDriverDramAddrLow, 1), MSG_MAP(SetToolsDramAddrHigh, PPSMC_MSG_SetToolsDramAddrHigh, 0), MSG_MAP(SetToolsDramAddrLow,PPSMC_MSG_SetToolsDramAddrLow, 0), - MSG_MAP(TransferTableSmu2Dram,
RE: [PATCH] drm/amdgpu: remove unnecessary reading for epprom header
Hi, Hawking, Agree with your suggestion, and it could further simplify our codes. I will refactor them again. Best Regards Dennis Li -Original Message- From: Zhang, Hawking Sent: Friday, February 26, 2021 12:30 PM To: Li, Dennis ; amd-gfx@lists.freedesktop.org; Chen, Guchun ; Koenig, Christian Cc: Li, Dennis Subject: RE: [PATCH] drm/amdgpu: remove unnecessary reading for epprom header [AMD Public Use] What about merge this function with amdgpu_ras_check_err_threshold? Regards, Hawking -Original Message- From: Dennis Li Sent: Friday, February 26, 2021 09:26 To: amd-gfx@lists.freedesktop.org; Chen, Guchun ; Zhang, Hawking ; Koenig, Christian Cc: Li, Dennis Subject: [PATCH] drm/amdgpu: remove unnecessary reading for epprom header If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly. Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 19d9aa76cfbf..4310ad63890c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -439,41 +439,19 @@ int amdgpu_ras_eeprom_check_err_threshold( bool *exceed_err_limit) { struct amdgpu_device *adev = to_amdgpu_device(control); - unsigned char buff[EEPROM_ADDRESS_SIZE + - EEPROM_TABLE_HEADER_SIZE] = { 0 }; - struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; - struct i2c_msg msg = { - .addr = control->i2c_address, - .flags = I2C_M_RD, - .len = EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE, - .buf = buff, - }; - int ret; *exceed_err_limit = false; if (!__is_ras_eeprom_supported(adev)) return 0; - /* read EEPROM table header */ - mutex_lock(&control->tbl_mutex); - ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1); - if (ret < 1) { - dev_err(adev->dev, "Failed to read EEPROM table header.\n"); - goto err; - } - - __decode_table_header_from_buff(hdr, &buff[2]); - - if (hdr->header == EEPROM_TABLE_HDR_BAD) { + if (control->tbl_hdr.header == EEPROM_TABLE_HDR_BAD) { dev_warn(adev->dev, "This GPU is in BAD status."); dev_warn(adev->dev, "Please retire it or setting one bigger " "threshold value when reloading driver.\n"); *exceed_err_limit = true; } -err: - mutex_unlock(&control->tbl_mutex); return 0; } -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: remove unnecessary reading for epprom header
[AMD Public Use] What about merge this function with amdgpu_ras_check_err_threshold? Regards, Hawking -Original Message- From: Dennis Li Sent: Friday, February 26, 2021 09:26 To: amd-gfx@lists.freedesktop.org; Chen, Guchun ; Zhang, Hawking ; Koenig, Christian Cc: Li, Dennis Subject: [PATCH] drm/amdgpu: remove unnecessary reading for epprom header If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly. Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 19d9aa76cfbf..4310ad63890c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -439,41 +439,19 @@ int amdgpu_ras_eeprom_check_err_threshold( bool *exceed_err_limit) { struct amdgpu_device *adev = to_amdgpu_device(control); - unsigned char buff[EEPROM_ADDRESS_SIZE + - EEPROM_TABLE_HEADER_SIZE] = { 0 }; - struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; - struct i2c_msg msg = { - .addr = control->i2c_address, - .flags = I2C_M_RD, - .len = EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE, - .buf = buff, - }; - int ret; *exceed_err_limit = false; if (!__is_ras_eeprom_supported(adev)) return 0; - /* read EEPROM table header */ - mutex_lock(&control->tbl_mutex); - ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1); - if (ret < 1) { - dev_err(adev->dev, "Failed to read EEPROM table header.\n"); - goto err; - } - - __decode_table_header_from_buff(hdr, &buff[2]); - - if (hdr->header == EEPROM_TABLE_HDR_BAD) { + if (control->tbl_hdr.header == EEPROM_TABLE_HDR_BAD) { dev_warn(adev->dev, "This GPU is in BAD status."); dev_warn(adev->dev, "Please retire it or setting one bigger " "threshold value when reloading driver.\n"); *exceed_err_limit = true; } -err: - mutex_unlock(&control->tbl_mutex); return 0; } -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: remove unused variable in amdgpu_dma_buf_unmap()
[AMD Public Use] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Kevin Wang Sent: Friday, February 26, 2021 12:25 To: amd-gfx@lists.freedesktop.org Cc: Wang, Kevin(Yang) Subject: [PATCH] drm/amdgpu: remove unused variable in amdgpu_dma_buf_unmap() clean up unsued variable in amdgpu_dma_buf_unmap(). Fixes: drm/amdgpu: Remove amdgpu_device arg from free_sgt api Signed-off-by: Kevin Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index e83d73ec0e9d..5ec6556ebf7a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -356,10 +356,6 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir) { - struct dma_buf *dma_buf = attach->dmabuf; - struct drm_gem_object *obj = dma_buf->priv; - struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); - if (sgt->sgl->page_link) { dma_unmap_sgtable(attach->dev, sgt, dir, 0); sg_free_table(sgt); -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Chawking.zhang%40amd.com%7C9555021bae2f415e1bef08d8da0e8125%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637499103125405709%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=y21rsObIUGSeS2XiFfVMrUNuD5%2Fc%2BrfXjOd%2BamdLa60%3D&reserved=0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: add RAP TA version print in amdgpu_firmware_info
[AMD Public Use] Reviewed-by: Hawking Zhang -Original Message- From: Wang, Kevin(Yang) Sent: Friday, February 26, 2021 12:24 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Li, Candice ; Wang, Kevin(Yang) Subject: [PATCH] drm/amdgpu: add RAP TA version print in amdgpu_firmware_info add RAP TA version print in amdgpu_firmware_info. Signed-off-by: Kevin Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index ce031a77cda5..a5ed9530f542 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -305,6 +305,10 @@ static int amdgpu_firmware_info(struct drm_amdgpu_info_firmware *fw_info, fw_info->ver = adev->psp.ta_fw_version; fw_info->feature = adev->psp.ta_dtm_ucode_version; break; + case 4: + fw_info->ver = adev->psp.ta_fw_version; + fw_info->feature = adev->psp.ta_rap_ucode_version; + break; default: return -EINVAL; } @@ -1490,6 +1494,10 @@ static int amdgpu_debugfs_firmware_info_show(struct seq_file *m, void *unused) seq_printf(m, "TA %s feature version: 0x%08x, firmware version: 0x%08x\n", "DTM", fw_info.feature, fw_info.ver); break; + case 4: + seq_printf(m, "TA %s feature version: 0x%08x, firmware version: 0x%08x\n", + "RAP", fw_info.feature, fw_info.ver); + break; default: return -EINVAL; } -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: remove unused variable in amdgpu_dma_buf_unmap()
clean up unsued variable in amdgpu_dma_buf_unmap(). Fixes: drm/amdgpu: Remove amdgpu_device arg from free_sgt api Signed-off-by: Kevin Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index e83d73ec0e9d..5ec6556ebf7a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -356,10 +356,6 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir) { - struct dma_buf *dma_buf = attach->dmabuf; - struct drm_gem_object *obj = dma_buf->priv; - struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); - if (sgt->sgl->page_link) { dma_unmap_sgtable(attach->dev, sgt, dir, 0); sg_free_table(sgt); -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: add RAP TA version print in amdgpu_firmware_info
add RAP TA version print in amdgpu_firmware_info. Signed-off-by: Kevin Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index ce031a77cda5..a5ed9530f542 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -305,6 +305,10 @@ static int amdgpu_firmware_info(struct drm_amdgpu_info_firmware *fw_info, fw_info->ver = adev->psp.ta_fw_version; fw_info->feature = adev->psp.ta_dtm_ucode_version; break; + case 4: + fw_info->ver = adev->psp.ta_fw_version; + fw_info->feature = adev->psp.ta_rap_ucode_version; + break; default: return -EINVAL; } @@ -1490,6 +1494,10 @@ static int amdgpu_debugfs_firmware_info_show(struct seq_file *m, void *unused) seq_printf(m, "TA %s feature version: 0x%08x, firmware version: 0x%08x\n", "DTM", fw_info.feature, fw_info.ver); break; + case 4: + seq_printf(m, "TA %s feature version: 0x%08x, firmware version: 0x%08x\n", + "RAP", fw_info.feature, fw_info.ver); + break; default: return -EINVAL; } -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: remove unnecessary reading for epprom header
If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly. Signed-off-by: Dennis Li diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 19d9aa76cfbf..4310ad63890c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -439,41 +439,19 @@ int amdgpu_ras_eeprom_check_err_threshold( bool *exceed_err_limit) { struct amdgpu_device *adev = to_amdgpu_device(control); - unsigned char buff[EEPROM_ADDRESS_SIZE + - EEPROM_TABLE_HEADER_SIZE] = { 0 }; - struct amdgpu_ras_eeprom_table_header *hdr = &control->tbl_hdr; - struct i2c_msg msg = { - .addr = control->i2c_address, - .flags = I2C_M_RD, - .len = EEPROM_ADDRESS_SIZE + EEPROM_TABLE_HEADER_SIZE, - .buf = buff, - }; - int ret; *exceed_err_limit = false; if (!__is_ras_eeprom_supported(adev)) return 0; - /* read EEPROM table header */ - mutex_lock(&control->tbl_mutex); - ret = i2c_transfer(&adev->pm.smu_i2c, &msg, 1); - if (ret < 1) { - dev_err(adev->dev, "Failed to read EEPROM table header.\n"); - goto err; - } - - __decode_table_header_from_buff(hdr, &buff[2]); - - if (hdr->header == EEPROM_TABLE_HDR_BAD) { + if (control->tbl_hdr.header == EEPROM_TABLE_HDR_BAD) { dev_warn(adev->dev, "This GPU is in BAD status."); dev_warn(adev->dev, "Please retire it or setting one bigger " "threshold value when reloading driver.\n"); *exceed_err_limit = true; } -err: - mutex_unlock(&control->tbl_mutex); return 0; } -- 2.17.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: remove unused variables
Not used so remove them. Fixes: d2d0c920a127 ("drm/amdgpu: Remove amdgpu_device arg from free_sgt api") Cc: Ramesh Errabolu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index e83d73ec0e9d..5ec6556ebf7a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -356,10 +356,6 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach, struct sg_table *sgt, enum dma_data_direction dir) { - struct dma_buf *dma_buf = attach->dmabuf; - struct drm_gem_object *obj = dma_buf->priv; - struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); - if (sgt->sgl->page_link) { dma_unmap_sgtable(attach->dev, sgt, dir, 0); sg_free_table(sgt); -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: Fix an uninitialized index variable
On Thu, Feb 25, 2021 at 10:01 AM Arnd Bergmann wrote: > > From: Arnd Bergmann > > clang points out that the new logic uses an always-uninitialized > array index: > > drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9810:38: warning: > variable 'i' is uninitialized when used here [-Wuninitialized] > timing = &edid->detailed_timings[i]; > ^ > drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9720:7: note: > initialize the variable 'i' to silence this warning > > My best guess is that the index should have been returned by the > parse_hdmi_amd_vsdb() function that walks an array here, so do that. > > Fixes: f9b4f20c4777 ("drm/amd/display: Add Freesync HDMI support to DM") > Signed-off-by: Arnd Bergmann This looks correct to me. Stylon can you verify? Thanks, Alex > --- > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c| 16 > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index b19b93c74bae..667c0d52dbfa 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -9736,7 +9736,7 @@ static bool parse_edid_cea(struct amdgpu_dm_connector > *aconnector, > return false; > } > > -static bool parse_hdmi_amd_vsdb(struct amdgpu_dm_connector *aconnector, > +static int parse_hdmi_amd_vsdb(struct amdgpu_dm_connector *aconnector, > struct edid *edid, struct amdgpu_hdmi_vsdb_info *vsdb_info) > { > uint8_t *edid_ext = NULL; > @@ -9746,7 +9746,7 @@ static bool parse_hdmi_amd_vsdb(struct > amdgpu_dm_connector *aconnector, > /*- drm_find_cea_extension() -*/ > /* No EDID or EDID extensions */ > if (edid == NULL || edid->extensions == 0) > - return false; > + return -ENODEV; > > /* Find CEA extension */ > for (i = 0; i < edid->extensions; i++) { > @@ -9756,14 +9756,15 @@ static bool parse_hdmi_amd_vsdb(struct > amdgpu_dm_connector *aconnector, > } > > if (i == edid->extensions) > - return false; > + return -ENODEV; > > /*- cea_db_offsets() -*/ > if (edid_ext[0] != CEA_EXT) > - return false; > + return -ENODEV; > > valid_vsdb_found = parse_edid_cea(aconnector, edid_ext, EDID_LENGTH, > vsdb_info); > - return valid_vsdb_found; > + > + return valid_vsdb_found ? i : -ENODEV; > } > > void amdgpu_dm_update_freesync_caps(struct drm_connector *connector, > @@ -9781,7 +9782,6 @@ void amdgpu_dm_update_freesync_caps(struct > drm_connector *connector, > struct amdgpu_device *adev = drm_to_adev(dev); > bool freesync_capable = false; > struct amdgpu_hdmi_vsdb_info vsdb_info = {0}; > - bool hdmi_valid_vsdb_found = false; > > if (!connector->state) { > DRM_ERROR("%s - Connector has no state", __func__); > @@ -9857,8 +9857,8 @@ void amdgpu_dm_update_freesync_caps(struct > drm_connector *connector, > } > } > } else if (edid && amdgpu_dm_connector->dc_sink->sink_signal == > SIGNAL_TYPE_HDMI_TYPE_A) { > - hdmi_valid_vsdb_found = > parse_hdmi_amd_vsdb(amdgpu_dm_connector, edid, &vsdb_info); > - if (hdmi_valid_vsdb_found && vsdb_info.freesync_supported) { > + i = parse_hdmi_amd_vsdb(amdgpu_dm_connector, edid, > &vsdb_info); > + if (i >= 0 && vsdb_info.freesync_supported) { > timing = &edid->detailed_timings[i]; > data= &timing->data.other_data; > > -- > 2.29.2 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: Fix an uninitialized index variable
On Thu, Feb 25, 2021 at 10:34 PM 'Nick Desaulniers' via Clang Built Linux wrote: > return parse_edid_cea(aconnector, edid_ext, EDID_LENGTH, vsdb_info) ? i : > -ENODEV; > > would suffice, but the patch is still fine as is. Right, I did not want to change more than necessary here, and the original code already had the extra assignment instead of returning the value. > > @@ -9857,8 +9857,8 @@ void amdgpu_dm_update_freesync_caps(struct > > drm_connector *connector, > > } > > } > > } else if (edid && amdgpu_dm_connector->dc_sink->sink_signal == > > SIGNAL_TYPE_HDMI_TYPE_A) { > > - hdmi_valid_vsdb_found = > > parse_hdmi_amd_vsdb(amdgpu_dm_connector, edid, &vsdb_info); > > - if (hdmi_valid_vsdb_found && vsdb_info.freesync_supported) { > > + i = parse_hdmi_amd_vsdb(amdgpu_dm_connector, edid, > > &vsdb_info); > > + if (i >= 0 && vsdb_info.freesync_supported) { > > reusing `i` here is safe, for now, but reuse of variables like this in > separate branches like this might not get noticed if the function is > amended in the future. > > > timing = &edid->detailed_timings[i]; > > data= &timing->data.other_data; The entire point of the patch is that 'i' is in fact used in the following line, but was lacking an intialization. Arnd ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: fix 64-bit integer division
[AMD Official Use Only - Internal Distribution Only] Hi Arnd, I have all the patches ready and I have tested them with the feature/platform I'm working on and Bindu helped to test the 32bit build. I'm in process of submitting the latest change. Thanks, Vladimir. On 2021-02-25, 4:36 PM, "Arnd Bergmann" wrote: On Thu, Feb 25, 2021 at 3:33 PM Arnd Bergmann wrote: > > From: Arnd Bergmann > > The new display synchronization code caused a regression > on all 32-bit architectures: > > ld.lld: error: undefined symbol: __aeabi_uldivmod > >>> referenced by dce_clock_source.c > >>> gpu/drm/amd/display/dc/dce/dce_clock_source.o:(get_pixel_clk_frequency_100hz) in archive drivers/built-in.a > > ld.lld: error: undefined symbol: __aeabi_ldivmod > >>> referenced by dc_resource.c > >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) in archive drivers/built-in.a > >>> referenced by dc_resource.c > >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) in archive drivers/built-in.a > >>> referenced by dc_resource.c > >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) in archive drivers/built-in.a > > This is not a fast path, so the use of an explicit div_u64/div_s64 > seems appropriate. I found two more instances: >>> referenced by dcn20_optc.c >>> gpu/drm/amd/display/dc/dcn20/dcn20_optc.o:(optc2_align_vblanks) in archive drivers/built-in.a >>> referenced by dcn10_hw_sequencer.c >>> gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.o:(reduceSizeAndFraction) in archive drivers/built-in.a I have patches for both, but will let the randconfig build box keep working on it over night to see if there are any others. Let me know if you want a combined patch or one per file once there are no more regressions. Arnd ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH v3] drm/scheduler: Fix hang when sched_entity released
Problem: If scheduler is already stopped by the time sched_entity is released and entity's job_queue not empty I encountred a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle never becomes false. Fix: In drm_sched_fini detach all sched_entities from the scheduler's run queues. This will satisfy drm_sched_entity_is_idle. Also wakeup all those processes stuck in sched_entity flushing as the scheduler main thread which wakes them up is stopped by now. v2: Reverse order of drm_sched_rq_remove_entity and marking s_entity as stopped to prevent reinserion back to rq due to race. v3: Drop drm_sched_rq_remove_entity, only modify entity->stopped and check for it in drm_sched_entity_is_idle Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/scheduler/sched_entity.c | 3 ++- drivers/gpu/drm/scheduler/sched_main.c | 23 +++ 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 92d965b629c6..68b10813129a 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -116,7 +116,8 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity) rmb(); /* for list_empty to work without lock */ if (list_empty(&entity->list) || - spsc_queue_count(&entity->job_queue) == 0) + spsc_queue_count(&entity->job_queue) == 0 || + entity->stopped) return true; return false; diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 908b0b56032d..b50fab472734 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -897,9 +897,32 @@ EXPORT_SYMBOL(drm_sched_init); */ void drm_sched_fini(struct drm_gpu_scheduler *sched) { + int i; + struct drm_sched_entity *s_entity; if (sched->thread) kthread_stop(sched->thread); + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + struct drm_sched_rq *rq = &sched->sched_rq[i]; + + if (!rq) + continue; + + spin_lock(&rq->lock); + list_for_each_entry(s_entity, &rq->entities, list) + /* +* Prevents reinsertion and marks job_queue as idle, +* it will removed from rq in drm_sched_entity_fini +* eventually +*/ + s_entity->stopped = true; + spin_unlock(&rq->lock); + + } + + /* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */ + wake_up_all(&sched->job_scheduled); + /* Confirm no work left behind accessing device structures */ cancel_delayed_work_sync(&sched->work_tdr); -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: fix 64-bit integer division
On Thu, Feb 25, 2021 at 3:33 PM Arnd Bergmann wrote: > > From: Arnd Bergmann > > The new display synchronization code caused a regression > on all 32-bit architectures: > > ld.lld: error: undefined symbol: __aeabi_uldivmod > >>> referenced by dce_clock_source.c > >>> > >>> gpu/drm/amd/display/dc/dce/dce_clock_source.o:(get_pixel_clk_frequency_100hz) > >>> in archive drivers/built-in.a > > ld.lld: error: undefined symbol: __aeabi_ldivmod > >>> referenced by dc_resource.c > >>> > >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) > >>> in archive drivers/built-in.a > >>> referenced by dc_resource.c > >>> > >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) > >>> in archive drivers/built-in.a > >>> referenced by dc_resource.c > >>> > >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) > >>> in archive drivers/built-in.a > > This is not a fast path, so the use of an explicit div_u64/div_s64 > seems appropriate. I found two more instances: >>> referenced by dcn20_optc.c >>> >>> gpu/drm/amd/display/dc/dcn20/dcn20_optc.o:(optc2_align_vblanks) in archive >>> drivers/built-in.a >>> referenced by dcn10_hw_sequencer.c >>> >>> gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.o:(reduceSizeAndFraction) >>> in archive drivers/built-in.a I have patches for both, but will let the randconfig build box keep working on it over night to see if there are any others. Let me know if you want a combined patch or one per file once there are no more regressions. Arnd ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/display: remove unnecessary conversion to bool
On Thu, Feb 25, 2021 at 4:19 AM Jiapeng Chong wrote: > > Fix the following coccicheck warnings: > > ./drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c:243:67-72: > WARNING: conversion to bool not needed here. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong Applied. Thanks! Alex > --- > drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c > b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c > index 3398540..102f6a0 100644 > --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c > +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c > @@ -240,7 +240,7 @@ bool dpp3_program_gamcor_lut( > next_mode = LUT_RAM_A; > > dpp3_power_on_gamcor_lut(dpp_base, true); > - dpp3_configure_gamcor_lut(dpp_base, next_mode == LUT_RAM_A ? > true:false); > + dpp3_configure_gamcor_lut(dpp_base, next_mode == LUT_RAM_A); > > if (next_mode == LUT_RAM_B) { > gam_regs.start_cntl_b = REG(CM_GAMCOR_RAMB_START_CNTL_B); > -- > 1.8.3.1 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
On Thu, Feb 25, 2021 at 4:02 AM Yang Li wrote: > > Fix the following coccicheck warning: > ./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1589:0-23: WARNING: > fops_ib_preempt should be defined with DEFINE_DEBUGFS_ATTRIBUTE > ./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1592:0-23: WARNING: > fops_sclk_set should be defined with DEFINE_DEBUGFS_ATTRIBUTE > > Reported-by: Abaci Robot > Signed-off-by: Yang Li Applied. Thanks! Alex > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > index 0a25fec..52ef488 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c > @@ -1586,10 +1586,10 @@ static int amdgpu_debugfs_sclk_set(void *data, u64 > val) > return 0; > } > > -DEFINE_SIMPLE_ATTRIBUTE(fops_ib_preempt, NULL, > +DEFINE_DEBUGFS_ATTRIBUTE(fops_ib_preempt, NULL, > amdgpu_debugfs_ib_preempt, "%llu\n"); > > -DEFINE_SIMPLE_ATTRIBUTE(fops_sclk_set, NULL, > +DEFINE_DEBUGFS_ATTRIBUTE(fops_sclk_set, NULL, > amdgpu_debugfs_sclk_set, "%llu\n"); > > int amdgpu_debugfs_init(struct amdgpu_device *adev) > -- > 1.8.3.1 > > ___ > dri-devel mailing list > dri-de...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v2] drm/scheduler: Fix hang when sched_entity released
On 2021-02-25 1:42 p.m., Christian König wrote: Am 25.02.21 um 17:03 schrieb Andrey Grodzovsky: On 2021-02-25 2:53 a.m., Christian König wrote: Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: Ping Sorry, I've been on vacation this week. Andrey On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: On 2/20/21 3:38 AM, Christian König wrote: Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: On 2/18/21 10:15 AM, Christian König wrote: Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: On 2/18/21 3:07 AM, Christian König wrote: Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: Problem: If scheduler is already stopped by the time sched_entity is released and entity's job_queue not empty I encountred a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle never becomes false. Fix: In drm_sched_fini detach all sched_entities from the scheduler's run queues. This will satisfy drm_sched_entity_is_idle. Also wakeup all those processes stuck in sched_entity flushing as the scheduler main thread which wakes them up is stopped by now. v2: Reverse order of drm_sched_rq_remove_entity and marking s_entity as stopped to prevent reinserion back to rq due to race. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/scheduler/sched_main.c | 31 +++ 1 file changed, 31 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 908b0b5..c6b7947 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); */ void drm_sched_fini(struct drm_gpu_scheduler *sched) { + int i; + struct drm_sched_entity *s_entity; BTW: Please order that so that i is declared last. if (sched->thread) kthread_stop(sched->thread); + /* Detach all sched_entites from this scheduler once it's stopped */ + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + struct drm_sched_rq *rq = &sched->sched_rq[i]; + + if (!rq) + continue; + + /* Loop this way because rq->lock is taken in drm_sched_rq_remove_entity */ + spin_lock(&rq->lock); + while ((s_entity = list_first_entry_or_null(&rq->entities, + struct drm_sched_entity, + list))) { + spin_unlock(&rq->lock); + + /* Prevent reinsertion and remove */ + spin_lock(&s_entity->rq_lock); + s_entity->stopped = true; + drm_sched_rq_remove_entity(rq, s_entity); + spin_unlock(&s_entity->rq_lock); Well this spin_unlock/lock dance here doesn't look correct at all now. Christian. In what way ? It's in the same same order as in other call sites (see drm_sched_entity_push_job and drm_sched_entity_flush). If i just locked rq->lock and did list_for_each_entry_safe while manually deleting entity->list instead of calling drm_sched_rq_remove_entity this still would not be possible as the order of lock acquisition between s_entity->rq_lock and rq->lock would be reverse compared to the call sites mentioned above. Ah, now I understand. You need this because drm_sched_rq_remove_entity() will grab the rq lock again! Problem is now what prevents the entity from being destroyed while you remove it? Christian. Right, well, since (unfortunately) sched_entity is part of amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted there is a problem here that we don't increment amdgpu_ctx.refcount when assigning sched_entity to new rq (e.g. before drm_sched_rq_add_entity) and not decrement before removing. We do it for amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and amdgpu_cs_parser_fini by calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit tricky due to all the drm_sched_entity_select_rq logic. Another, kind of a band aid fix, would probably be just locking amdgpu_ctx_mgr.lock around drm_sched_fini when finalizing the fence driver and around idr iteration in amdgpu_ctx_mgr_fini (which should be lock protected anyway as I see from other idr usages in the code) ... This should prevent this use after free. Puh, that's rather complicated as well. Ok let's look at it from the other side for a moment. Why do we have to remove the entities from the rq in the first place? Wouldn't it be sufficient to just set all of them to stopped? Christian. And adding it as another condition in drm_sched_entity_is_idle ? Yes, I think that should work. In this case reverse locking order is created, In drm_sched_entity_push_job and drm_sched_entity_flush lock entity->rq_lock locked first and rq->lock locked second. In drm_sched_fini on the other hand, I need to lock rq->lock first to iterate rq->entities and then lock s_entity->rq_lock for each entity to modify s_entity->stopped. I guess we could change s_entity->stopped to atomic ? Good point.
Re: [PATCH] drm/amdgpu: add missing df counter disable write
On Tue, Feb 23, 2021 at 4:34 PM Jonathan Kim wrote: > > Request to stop DF performance counters is missing the actual write to the > controller register. > > Reported-by: Chris Freehill > Signed-off-by: Jonathan Kim Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c > b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c > index 6b4b30a8dce5..44109a6b8f44 100644 > --- a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c > +++ b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c > @@ -568,6 +568,8 @@ static int df_v3_6_pmc_stop(struct amdgpu_device *adev, > uint64_t config, > if (ret) > return ret; > > + df_v3_6_perfmon_wreg(adev, lo_base_addr, lo_val, > + hi_base_addr, hi_val); > > if (is_remove) { > df_v3_6_reset_perfmon_cntr(adev, config, counter_idx); > -- > 2.17.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/5] drm/amdgpu: add asic callback for querying video codec info (v3)
Am 25.02.21 um 21:16 schrieb Alex Deucher: This will be used by a new INFO ioctl query to fetch the decode and encode capabilities from the kernel driver rather than hardcoding them in mesa. This gives us more fine grained control of capabilities using information that is only availabl in the kernel (e.g., platform limitations or bandwidth restrictions). v2: reorder the codecs to better align with mesa v3: add max_pixels_per_frame to handle the portrait case Reviewed-by: Leo Liu (v2) Signed-off-by: Alex Deucher Reviewed-by: Christian König for the series. --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 26 ++ 1 file changed, 26 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 22e5d9f284c3..09aec16c8feb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -583,6 +583,28 @@ enum amd_reset_method { AMD_RESET_METHOD_PCI, }; +#define AMDGPU_VIDEO_CODEC_TYPE_MPEG2 0 +#define AMDGPU_VIDEO_CODEC_TYPE_MPEG4 1 +#define AMDGPU_VIDEO_CODEC_TYPE_VC12 +#define AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC 3 +#define AMDGPU_VIDEO_CODEC_TYPE_HEVC 4 +#define AMDGPU_VIDEO_CODEC_TYPE_JPEG 5 +#define AMDGPU_VIDEO_CODEC_TYPE_VP96 +#define AMDGPU_VIDEO_CODEC_TYPE_AV17 + +struct amdgpu_video_codec_info { + u32 codec_type; + u32 max_width; + u32 max_height; + u32 max_pixels_per_frame; + u32 max_level; +}; + +struct amdgpu_video_codecs { + const u32 codec_count; + const struct amdgpu_video_codec_info *codec_array; +}; + /* * ASIC specific functions. */ @@ -627,6 +649,9 @@ struct amdgpu_asic_funcs { void (*pre_asic_init)(struct amdgpu_device *adev); /* enter/exit umd stable pstate */ int (*update_umd_stable_pstate)(struct amdgpu_device *adev, bool enter); + /* query video codecs */ + int (*query_video_codecs)(struct amdgpu_device *adev, bool encode, + const struct amdgpu_video_codecs **codecs); }; /* @@ -1221,6 +1246,7 @@ int emu_soc_asic_init(struct amdgpu_device *adev); #define amdgpu_asic_pre_asic_init(adev) (adev)->asic_funcs->pre_asic_init((adev)) #define amdgpu_asic_update_umd_stable_pstate(adev, enter) \ ((adev)->asic_funcs->update_umd_stable_pstate ? (adev)->asic_funcs->update_umd_stable_pstate((adev), (enter)) : 0) +#define amdgpu_asic_query_video_codecs(adev, e, c) (adev)->asic_funcs->query_video_codecs((adev), (e), (c)) #define amdgpu_inc_vram_lost(adev) atomic_inc(&((adev)->vram_lost_counter)); ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 5/5] drm/amdgpu/codec: drop the internal codec index
And just use the ioctl index. They are the same. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 9 - drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 16 - drivers/gpu/drm/amd/amdgpu/cik.c| 12 --- drivers/gpu/drm/amd/amdgpu/nv.c | 36 ++- drivers/gpu/drm/amd/amdgpu/si.c | 12 --- drivers/gpu/drm/amd/amdgpu/soc15.c | 46 + drivers/gpu/drm/amd/amdgpu/vi.c | 28 --- 7 files changed, 80 insertions(+), 79 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 09aec16c8feb..9ec99a2df666 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -583,15 +583,6 @@ enum amd_reset_method { AMD_RESET_METHOD_PCI, }; -#define AMDGPU_VIDEO_CODEC_TYPE_MPEG2 0 -#define AMDGPU_VIDEO_CODEC_TYPE_MPEG4 1 -#define AMDGPU_VIDEO_CODEC_TYPE_VC12 -#define AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC 3 -#define AMDGPU_VIDEO_CODEC_TYPE_HEVC 4 -#define AMDGPU_VIDEO_CODEC_TYPE_JPEG 5 -#define AMDGPU_VIDEO_CODEC_TYPE_VP96 -#define AMDGPU_VIDEO_CODEC_TYPE_AV17 - struct amdgpu_video_codec_info { u32 codec_type; u32 max_width; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 9f35e8a6c421..a5ed84bc83f7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -1012,14 +1012,14 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) int idx = codecs->codec_array[i].codec_type; switch (idx) { - case AMDGPU_VIDEO_CODEC_TYPE_MPEG2: - case AMDGPU_VIDEO_CODEC_TYPE_MPEG4: - case AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC: - case AMDGPU_VIDEO_CODEC_TYPE_VC1: - case AMDGPU_VIDEO_CODEC_TYPE_HEVC: - case AMDGPU_VIDEO_CODEC_TYPE_JPEG: - case AMDGPU_VIDEO_CODEC_TYPE_VP9: - case AMDGPU_VIDEO_CODEC_TYPE_AV1: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG2: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VC1: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_JPEG: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VP9: + case AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_AV1: caps->codec_info[idx].valid = 1; caps->codec_info[idx].max_width = codecs->codec_array[i].max_width; diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 72abfad2fd67..c0fcc41ee574 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -26,6 +26,8 @@ #include #include +#include + #include "amdgpu.h" #include "amdgpu_atombios.h" #include "amdgpu_ih.h" @@ -73,7 +75,7 @@ static const struct amdgpu_video_codec_info cik_video_codecs_encode_array[] = { { - .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC, + .codec_type = AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, .max_width = 2048, .max_height = 1152, .max_pixels_per_frame = 2048 * 1152, @@ -90,28 +92,28 @@ static const struct amdgpu_video_codecs cik_video_codecs_encode = static const struct amdgpu_video_codec_info cik_video_codecs_decode_array[] = { { - .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG2, + .codec_type = AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG2, .max_width = 2048, .max_height = 1152, .max_pixels_per_frame = 2048 * 1152, .max_level = 3, }, { - .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4, + .codec_type = AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4, .max_width = 2048, .max_height = 1152, .max_pixels_per_frame = 2048 * 1152, .max_level = 5, }, { - .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC, + .codec_type = AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, .max_width = 2048, .max_height = 1152, .max_pixels_per_frame = 2048 * 1152, .max_level = 41, }, { - .codec_type = AMDGPU_VIDEO_CODEC_TYPE_VC1, + .codec_type = AMDGPU_INFO_VIDEO_CAPS_CODEC_I
[PATCH 2/5] drm/amdgpu: add video decode/encode cap tables and asic callbacks (v3)
For each asic family. Will be used to populate tables for the new INFO ioctl query. v2: add max_pixels_per_frame to handle the portrait case v3: fix copy paste typos Reviewed-by: Leo Liu (v1) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/cik.c | 75 ++ drivers/gpu/drm/amd/amdgpu/nv.c| 179 ++ drivers/gpu/drm/amd/amdgpu/si.c| 109 ++ drivers/gpu/drm/amd/amdgpu/soc15.c | 230 + drivers/gpu/drm/amd/amdgpu/vi.c| 188 +++ 5 files changed, 781 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 4d6832cc7fb0..72abfad2fd67 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -70,6 +70,80 @@ #include "amdgpu_amdkfd.h" #include "dce_virtual.h" +static const struct amdgpu_video_codec_info cik_video_codecs_encode_array[] = +{ + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC, + .max_width = 2048, + .max_height = 1152, + .max_pixels_per_frame = 2048 * 1152, + .max_level = 0, + }, +}; + +static const struct amdgpu_video_codecs cik_video_codecs_encode = +{ + .codec_count = ARRAY_SIZE(cik_video_codecs_encode_array), + .codec_array = cik_video_codecs_encode_array, +}; + +static const struct amdgpu_video_codec_info cik_video_codecs_decode_array[] = +{ + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG2, + .max_width = 2048, + .max_height = 1152, + .max_pixels_per_frame = 2048 * 1152, + .max_level = 3, + }, + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4, + .max_width = 2048, + .max_height = 1152, + .max_pixels_per_frame = 2048 * 1152, + .max_level = 5, + }, + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC, + .max_width = 2048, + .max_height = 1152, + .max_pixels_per_frame = 2048 * 1152, + .max_level = 41, + }, + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_VC1, + .max_width = 2048, + .max_height = 1152, + .max_pixels_per_frame = 2048 * 1152, + .max_level = 4, + }, +}; + +static const struct amdgpu_video_codecs cik_video_codecs_decode = +{ + .codec_count = ARRAY_SIZE(cik_video_codecs_decode_array), + .codec_array = cik_video_codecs_decode_array, +}; + +static int cik_query_video_codecs(struct amdgpu_device *adev, bool encode, + const struct amdgpu_video_codecs **codecs) +{ + switch (adev->asic_type) { + case CHIP_BONAIRE: + case CHIP_HAWAII: + case CHIP_KAVERI: + case CHIP_KABINI: + case CHIP_MULLINS: + if (encode) + *codecs = &cik_video_codecs_encode; + else + *codecs = &cik_video_codecs_decode; + return 0; + default: + return -EINVAL; + } +} + /* * Indirect registers accessor */ @@ -1933,6 +2007,7 @@ static const struct amdgpu_asic_funcs cik_asic_funcs = .get_pcie_replay_count = &cik_get_pcie_replay_count, .supports_baco = &cik_asic_supports_baco, .pre_asic_init = &cik_pre_asic_init, + .query_video_codecs = &cik_query_video_codecs, }; static int cik_common_early_init(void *handle) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index 160fa5f59805..c7ea779e7d6e 100644 --- a/drivers/gpu/drm/amd/amdgpu/nv.c +++ b/drivers/gpu/drm/amd/amdgpu/nv.c @@ -65,6 +65,184 @@ static const struct amd_ip_funcs nv_common_ip_funcs; +/* Navi */ +static const struct amdgpu_video_codec_info nv_video_codecs_encode_array[] = +{ + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC, + .max_width = 4096, + .max_height = 2304, + .max_pixels_per_frame = 4096 * 2304, + .max_level = 0, + }, + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_HEVC, + .max_width = 4096, + .max_height = 2304, + .max_pixels_per_frame = 4096 * 2304, + .max_level = 0, + }, +}; + +static const struct amdgpu_video_codecs nv_video_codecs_encode = +{ + .codec_count = ARRAY_SIZE(nv_video_codecs_encode_array), + .codec_array = nv_video_codecs_encode_array, +}; + +/* Navi1x */ +static const struct amdgpu_video_codec_info nv_video_codecs_decode_array[] = +{ + { + .codec_type = AMDGPU_VIDEO_CODEC_TYPE_MPEG2, + .max_width = 4096, + .max_height = 4096, + .max_pixels_per_frame = 4096 * 4096, + .max_level = 3, + }, + { + .codec_type =
[PATCH 4/5] drm/amdgpu: bump driver version for new video codec INFO ioctl query
So mesa can check when to query the kernel vs use hardcoded codec bandwidth data. Reviewed-by: Leo Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 4575192d9b08..4acfb136a295 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -90,9 +90,10 @@ * - 3.38.0 - Add AMDGPU_IB_FLAG_EMIT_MEM_SYNC * - 3.39.0 - DMABUF implicit sync does a full pipeline sync * - 3.40.0 - Add AMDGPU_IDS_FLAGS_TMZ + * - 3.41.0 - Add video codec query */ #define KMS_DRIVER_MAJOR 3 -#define KMS_DRIVER_MINOR 40 +#define KMS_DRIVER_MINOR 41 #define KMS_DRIVER_PATCHLEVEL 0 int amdgpu_vram_limit; -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 3/5] drm/amdgpu: add INFO ioctl support for querying video caps (v4)
We currently hardcode these in mesa, but querying them from the kernel makes more sense since there may be board specific limitations that the kernel driver is better suited to determining. Userpace patches that use this interface: https://gitlab.freedesktop.org/leoliu/drm/-/commits/info_video_caps https://gitlab.freedesktop.org/leoliu/mesa/-/commits/info_video_caps v2: reorder the codecs to better align with mesa v3: add max_pixels_per_frame to handle the portrait case, squash in memory leak fix v4: drop extra break Reviewed-by: Leo Liu (v2) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 57 + include/uapi/drm/amdgpu_drm.h | 34 +++ 2 files changed, 91 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index cf81ade31703..9f35e8a6c421 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -982,6 +982,63 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) min_t(u64, size, sizeof(ras_mask))) ? -EFAULT : 0; } + case AMDGPU_INFO_VIDEO_CAPS: { + const struct amdgpu_video_codecs *codecs; + struct drm_amdgpu_info_video_caps *caps; + int r; + + switch (info->video_cap.type) { + case AMDGPU_INFO_VIDEO_CAPS_DECODE: + r = amdgpu_asic_query_video_codecs(adev, false, &codecs); + if (r) + return -EINVAL; + break; + case AMDGPU_INFO_VIDEO_CAPS_ENCODE: + r = amdgpu_asic_query_video_codecs(adev, true, &codecs); + if (r) + return -EINVAL; + break; + default: + DRM_DEBUG_KMS("Invalid request %d\n", + info->video_cap.type); + return -EINVAL; + } + + caps = kzalloc(sizeof(*caps), GFP_KERNEL); + if (!caps) + return -ENOMEM; + + for (i = 0; i < codecs->codec_count; i++) { + int idx = codecs->codec_array[i].codec_type; + + switch (idx) { + case AMDGPU_VIDEO_CODEC_TYPE_MPEG2: + case AMDGPU_VIDEO_CODEC_TYPE_MPEG4: + case AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC: + case AMDGPU_VIDEO_CODEC_TYPE_VC1: + case AMDGPU_VIDEO_CODEC_TYPE_HEVC: + case AMDGPU_VIDEO_CODEC_TYPE_JPEG: + case AMDGPU_VIDEO_CODEC_TYPE_VP9: + case AMDGPU_VIDEO_CODEC_TYPE_AV1: + caps->codec_info[idx].valid = 1; + caps->codec_info[idx].max_width = + codecs->codec_array[i].max_width; + caps->codec_info[idx].max_height = + codecs->codec_array[i].max_height; + caps->codec_info[idx].max_pixels_per_frame = + codecs->codec_array[i].max_pixels_per_frame; + caps->codec_info[idx].max_level = + codecs->codec_array[i].max_level; + break; + default: + break; + } + } + r = copy_to_user(out, caps, +min((size_t)size, sizeof(*caps))) ? -EFAULT : 0; + kfree(caps); + return r; + } default: DRM_DEBUG_KMS("Invalid request %d\n", info->query); return -EINVAL; diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index 7fb9c09ee93f..728566542f8a 100644 --- a/include/uapi/drm/amdgpu_drm.h +++ b/include/uapi/drm/amdgpu_drm.h @@ -782,6 +782,12 @@ struct drm_amdgpu_cs_chunk_data { #define AMDGPU_INFO_VRAM_LOST_COUNTER 0x1F /* query ras mask of enabled features*/ #define AMDGPU_INFO_RAS_ENABLED_FEATURES 0x20 +/* query video encode/decode caps */ +#define AMDGPU_INFO_VIDEO_CAPS 0x21 + /* Subquery id: Decode */ + #define AMDGPU_INFO_VIDEO_CAPS_DECODE 0 + /* Subquery id: Encode */ + #define AMDGPU_INFO_VIDEO_CAPS_ENCODE 1 /* RAS MASK: UMC (VRAM) */ #define AMDGPU_INFO_RAS_ENABLED_UMC(1 << 0) @@ -878,6 +884,10 @@ struct drm_amdgpu_info { struct { __u32 type; } sensor_info; + + struct { + __u32 type; + } vi
[PATCH 1/5] drm/amdgpu: add asic callback for querying video codec info (v3)
This will be used by a new INFO ioctl query to fetch the decode and encode capabilities from the kernel driver rather than hardcoding them in mesa. This gives us more fine grained control of capabilities using information that is only availabl in the kernel (e.g., platform limitations or bandwidth restrictions). v2: reorder the codecs to better align with mesa v3: add max_pixels_per_frame to handle the portrait case Reviewed-by: Leo Liu (v2) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 26 ++ 1 file changed, 26 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 22e5d9f284c3..09aec16c8feb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -583,6 +583,28 @@ enum amd_reset_method { AMD_RESET_METHOD_PCI, }; +#define AMDGPU_VIDEO_CODEC_TYPE_MPEG2 0 +#define AMDGPU_VIDEO_CODEC_TYPE_MPEG4 1 +#define AMDGPU_VIDEO_CODEC_TYPE_VC12 +#define AMDGPU_VIDEO_CODEC_TYPE_MPEG4_AVC 3 +#define AMDGPU_VIDEO_CODEC_TYPE_HEVC 4 +#define AMDGPU_VIDEO_CODEC_TYPE_JPEG 5 +#define AMDGPU_VIDEO_CODEC_TYPE_VP96 +#define AMDGPU_VIDEO_CODEC_TYPE_AV17 + +struct amdgpu_video_codec_info { + u32 codec_type; + u32 max_width; + u32 max_height; + u32 max_pixels_per_frame; + u32 max_level; +}; + +struct amdgpu_video_codecs { + const u32 codec_count; + const struct amdgpu_video_codec_info *codec_array; +}; + /* * ASIC specific functions. */ @@ -627,6 +649,9 @@ struct amdgpu_asic_funcs { void (*pre_asic_init)(struct amdgpu_device *adev); /* enter/exit umd stable pstate */ int (*update_umd_stable_pstate)(struct amdgpu_device *adev, bool enter); + /* query video codecs */ + int (*query_video_codecs)(struct amdgpu_device *adev, bool encode, + const struct amdgpu_video_codecs **codecs); }; /* @@ -1221,6 +1246,7 @@ int emu_soc_asic_init(struct amdgpu_device *adev); #define amdgpu_asic_pre_asic_init(adev) (adev)->asic_funcs->pre_asic_init((adev)) #define amdgpu_asic_update_umd_stable_pstate(adev, enter) \ ((adev)->asic_funcs->update_umd_stable_pstate ? (adev)->asic_funcs->update_umd_stable_pstate((adev), (enter)) : 0) +#define amdgpu_asic_query_video_codecs(adev, e, c) (adev)->asic_funcs->query_video_codecs((adev), (e), (c)) #define amdgpu_inc_vram_lost(adev) atomic_inc(&((adev)->vram_lost_counter)); -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint
Am 25.02.21 um 19:33 schrieb Felix Kuehling: [SNIP] This in turn can lead to starvation of the work handler and so a life lock as well. I won't touch rptr or wptr at all for this. Not sure what's your idea here, using ih->lock. Is it to completely drain all IRQs until the IH ring is completely empty? Correct. That can live-lock, if the GPU produces IRQs faster than the kernel can process them. Therefore I was looking at rptr and wptr to drain only IRQs that were already in the queue when the drain call was made. That also ensures that the wait time is bounded and should be short (unless the ring size is huge). Correct as well, but the problem here is that Jonathan's implementation is not even remotely correct. See when you look at the rptr and wptr you can't be sure that they haven't wrapped around between two looks. What you could do is look at both the rptr as well as the original wptr, but that is tricky. The code in Jon's patch was suggested by me. I check for wrapping by comparing rptr with the rptr from the previous loop iteration. Comparing rptr with wptr you can never be sure whether rptr has already passed wptr, or whether rptr has to wrap first. Exactly that's my concern as well. I could see a compromise where we sleep and wake up the waiting threads when 1. the IH ring is empty 2. the IH rptr wraps That should be good enough to ensure a quick response in the common case (no interrupt storm), and a reasonable response in the interrupt storm case. But then it's still not clear what's the correct condition for checking that the interrupts the caller cares about have been drained. Looking at just rptr and wptr is always ambiguous. Maybe we could use timestamps of the last processed interrupt? Those 48-bit time stamps wrap much less frequently. The idea is this: * At the start get the timestamp of the last written IH ring entry (checkpoint) * Wait until the last_processed IH timestamp passes the checkpoint: sign_extend(last_processed - checkpoint) >= 0 Yeah that could work. Alternatively we could just have a rptr wrap around counter. This way you do something like this: 1. get the wrap around counter. 2. get wptr 3. get rptr 4. compare the wrap around counter with the old one, if it has changed start over with #1 5. Use wrap around counter and rptr/wptr to come up with 64bit values. 6. Compare wptr with rptr/wrap around counter until we are sure the IHs are processed. Regards, Christian. Regards, Felix Regards, Christian. Regards, Felix Regards, Christian. Regards, Felix Suggested-by: Felix Kuehling Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 2 ++ 2 files changed, 47 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index dc852af4f3b7..cae50af9559d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -22,7 +22,7 @@ */ #include - +#include #include "amdgpu.h" #include "amdgpu_ih.h" @@ -160,6 +160,50 @@ void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const uint32_t *iv, } } +/** + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs up to +checkpoint + * + * @adev: amdgpu_device pointer + * @ih: ih ring to process + * + * Used to ensure ring has processed IVs up to the checkpoint write pointer. + */ +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device *adev, +struct amdgpu_ih_ring *ih) +{ +u32 prev_rptr, cur_rptr, checkpoint_wptr; + +if (!ih->enabled || adev->shutdown) +return -ENODEV; + +cur_rptr = READ_ONCE(ih->rptr); +/* Order read of current rptr with checktpoint wptr. */ +mb(); +checkpoint_wptr = amdgpu_ih_get_wptr(adev, ih); + +/* allow rptr to wrap around */ +if (cur_rptr > checkpoint_wptr) { +spin_begin(); +do { +spin_cpu_relax(); +prev_rptr = cur_rptr; +cur_rptr = READ_ONCE(ih->rptr); +} while (cur_rptr >= prev_rptr); +spin_end(); That's a certain NAK since it busy waits for IH processing. We need some event to trigger here. The function is meant to be just a waiter up to the checkpoint. There's a need to guarantee that "stale" interrupts have been processed on check before doing other stuff after call. The description could be improved to clarify that. Would busy waiting only on a locked ring help? I assume an unlocked ring means nothing to process so no need to wait and we can exit early. Or is it better to just to process the entries up to the checkpoint (maybe adjust amdgpu_ih_process for this need like adding a bool arg to skip restart or something)? Thanks, Jon +} + +/* wait for rptr to catch up to or pass checkpoint. */ +spin_begin(); +do { +spin_cpu_relax(); +prev_rptr = cur_rptr; +cur_rptr = READ_ONCE(ih->rptr); +} while (cur_rptr >= prev_rptr && cur_rptr < checkpoint_wptr); Same of course here. Chr
[PATCH 135/159] drm/amdgpu: workaround the TMR MC address issue (v2)
From: Oak Zeng With the 2-level gart page table, vram is squeezed into gart aperture and FB aperture is disabled. Therefore all VRAM virtual addresses are in the GART aperture. However currently PSP requires TMR addresses in FB aperture. So we need some design change at PSP FW level to support this 2-level gart table driver change. Right now this PSP FW support doesn't exist. To workaround this issue temporarily, FB aperture is added back and the gart aperture address is converted back to FB aperture for this PSP TMR address. Will revert it after we get a fix from PSP FW. v2: squash in tmr fix for other asics (Kevin) Signed-off-by: Oak Zeng Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 9 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 10 ++ drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 21 +++-- drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 10 ++ 4 files changed, 40 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index d5f3825cd479..cd4592ff70ae 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -208,6 +208,15 @@ struct amdgpu_gmc { */ u64 fb_start; u64 fb_end; + /* In the case of use GART table for vmid0 FB access, [fb_start, fb_end] +* will be squeezed to GART aperture. But we have a PSP FW issue to fix +* for now. To temporarily workaround the PSP FW issue, added below two +* variables to remember the original fb_start/end to re-enable FB +* aperture to workaround the PSP FW issue. Will delete it after we +* get a proper PSP FW fix. +*/ + u64 fb_start_original; + u64 fb_end_original; unsignedvram_width; u64 real_vram_size; int vram_mtrr; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index cf8cfe620d8c..a4f96d931573 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -407,6 +407,16 @@ static int psp_tmr_init(struct psp_context *psp) AMDGPU_GEM_DOMAIN_VRAM, &psp->tmr_bo, &psp->tmr_mc_addr, pptr); + /* workaround the tmr_mc_addr: +* PSP requires an address in FB aperture. Right now driver produce +* tmr_mc_addr in the GART aperture. Convert it back to FB aperture +* for PSP. Will revert it after we get a fix from PSP FW. +*/ + if (psp->adev->asic_type == CHIP_ALDEBARAN) { + psp->tmr_mc_addr -= psp->adev->gmc.fb_start; + psp->tmr_mc_addr += psp->adev->gmc.fb_start_original; + } + return ret; } diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c index 62019885bda5..d189507dcef0 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c @@ -141,12 +141,21 @@ static void gfxhub_v1_0_init_system_aperture_regs(struct amdgpu_device *adev) * FB aperture and AGP aperture. Disable them. */ if (adev->gmc.pdb0_bo) { - WREG32_SOC15(GC, 0, mmMC_VM_FB_LOCATION_TOP, 0); - WREG32_SOC15(GC, 0, mmMC_VM_FB_LOCATION_BASE, 0x00FF); - WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, 0); - WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, 0xFF); - WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR, 0x3FFF); - WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR, 0); + if (adev->asic_type == CHIP_ALDEBARAN) { + WREG32_SOC15(GC, 0, mmMC_VM_FB_LOCATION_TOP, adev->gmc.fb_end_original >> 24); + WREG32_SOC15(GC, 0, mmMC_VM_FB_LOCATION_BASE, adev->gmc.fb_start_original >> 24); + WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, 0); + WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, 0xFF); + WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR, adev->gmc.fb_start_original >> 18); + WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR, adev->gmc.fb_end_original >> 18); + } else { + WREG32_SOC15(GC, 0, mmMC_VM_FB_LOCATION_TOP, 0); + WREG32_SOC15(GC, 0, mmMC_VM_FB_LOCATION_BASE, 0x00FF); + WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, 0); + WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, 0xFF); + WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR, 0x3FFF); + WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR, 0); + } } }
[PATCH 3/3] drm/amdgpu: harvest edc status when connected to host via xGMI
From: Dennis Li When connected to a host via xGMI, system fatal errors may trigger warm reset, driver has no change to query edc status before reset. Therefore in this case, driver should harvest previous error loging registers during boot, instead of only resetting them. v2: 1. IP's ras_manager object is created when its ras feature is enabled, so change to query edc status after amdgpu_ras_late_init called 2. change to enable watchdog timer after finishing gfx edc init Signed-off-by: Dennis Li Reivewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 - drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 50 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 5 ++- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 12 ++ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 54 ++--- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h | 3 +- 7 files changed, 107 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 8e0a6c62322e..689addb1520d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -601,6 +601,7 @@ int amdgpu_gfx_ras_late_init(struct amdgpu_device *adev) struct ras_ih_if ih_info = { .cb = amdgpu_gfx_process_ras_data_cb, }; + struct ras_query_if info = { 0 }; if (!adev->gfx.ras_if) { adev->gfx.ras_if = kmalloc(sizeof(struct ras_common_if), GFP_KERNEL); @@ -612,13 +613,19 @@ int amdgpu_gfx_ras_late_init(struct amdgpu_device *adev) strcpy(adev->gfx.ras_if->name, "gfx"); } fs_info.head = ih_info.head = *adev->gfx.ras_if; - r = amdgpu_ras_late_init(adev, adev->gfx.ras_if, &fs_info, &ih_info); if (r) goto free; if (amdgpu_ras_is_supported(adev, adev->gfx.ras_if->block)) { + if (adev->gmc.xgmi.connected_to_cpu) { + info.head = *adev->gfx.ras_if; + amdgpu_ras_query_error_status(adev, &info); + } else { + amdgpu_ras_reset_error_status(adev, AMDGPU_RAS_BLOCK__GFX); + } + r = amdgpu_irq_get(adev, &adev->gfx.cp_ecc_error_irq, 0); if (r) goto late_fini; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h index d92f0f14cbeb..38af93f501e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h @@ -225,9 +225,9 @@ struct amdgpu_gfx_funcs { void (*reset_ras_error_count) (struct amdgpu_device *adev); void (*init_spm_golden)(struct amdgpu_device *adev); void (*query_ras_error_status) (struct amdgpu_device *adev); + void (*reset_ras_error_status) (struct amdgpu_device *adev); void (*update_perfmon_mgcg)(struct amdgpu_device *adev, bool enable); void (*enable_watchdog_timer)(struct amdgpu_device *adev); - void (*query_sq_timeout_status)(struct amdgpu_device *adev); }; struct sq_work { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 5805c78c356b..517e19fae34f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -109,7 +109,7 @@ static ssize_t amdgpu_ras_debugfs_read(struct file *f, char __user *buf, ssize_t s; char val[128]; - if (amdgpu_ras_error_query(obj->adev, &info)) + if (amdgpu_ras_query_error_status(obj->adev, &info)) return -EINVAL; s = snprintf(val, sizeof(val), "%s: %lu\n%s: %lu\n", @@ -434,7 +434,7 @@ static ssize_t amdgpu_ras_sysfs_read(struct device *dev, return snprintf(buf, PAGE_SIZE, "Query currently inaccessible\n"); - if (amdgpu_ras_error_query(obj->adev, &info)) + if (amdgpu_ras_query_error_status(obj->adev, &info)) return -EINVAL; return snprintf(buf, PAGE_SIZE, "%s: %lu\n%s: %lu\n", @@ -757,8 +757,8 @@ static int amdgpu_ras_enable_all_features(struct amdgpu_device *adev, /* feature ctl end */ /* query/inject/cure begin */ -int amdgpu_ras_error_query(struct amdgpu_device *adev, - struct ras_query_if *info) +int amdgpu_ras_query_error_status(struct amdgpu_device *adev, + struct ras_query_if *info) { struct ras_manager *obj = amdgpu_ras_find_obj(adev, &info->head); struct ras_err_data err_data = {0, 0, 0, NULL}; @@ -787,10 +787,16 @@ int amdgpu_ras_error_query(struct amdgpu_device *adev, case AMDGPU_RAS_BLOCK__GFX: if (adev->gfx.funcs->query_ras_error_count) adev->gfx.funcs->query_ras_error_count(adev, &err_data); + + if (adev->gfx.funcs->query_ra
[PATCH 2/3] drm/amdgpu: Make noretry the default on Aldebaran
From: Felix Kuehling This is needed for best machine learning performance. XNACK can still be enabled per-process if needed. Signed-off-by: Felix Kuehling Reviewed-by: Alex Deucher Reviewed-by: Philip Yang Tested-by: Alex Sierra Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 6d9c660da27a..8a64f5e49cb2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -508,6 +508,7 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) switch (adev->asic_type) { case CHIP_VEGA10: case CHIP_VEGA20: + case CHIP_ALDEBARAN: /* * noretry = 0 will cause kfd page fault tests fail * for some ASICs, so set default to 1 for these ASICs. -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 1/3] drm/amdgpu: update default timeout of Aldebaran SQ watchdog
From: Harish Kasiviswanathan Signed-off-by: Harish Kasiviswanathan Reivewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 7 +++ 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 42654deb9c55..29a2fc56e51b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -174,7 +174,7 @@ uint amdgpu_ras_mask = 0x; int amdgpu_bad_page_threshold = 100; struct amdgpu_watchdog_timer amdgpu_watchdog_timer = { .timeout_fatal_disable = false, - .period = 0x3f, /* about 8s */ + .period = 0x23, /* default to max. timeout = 1 << 0x23 cycles */ }; /** @@ -542,7 +542,7 @@ module_param_named(timeout_fatal_disable, amdgpu_watchdog_timer.timeout_fatal_di * DOC: timeout_period (uint) * Modify the watchdog timeout max_cycles as (1 << period) */ -MODULE_PARM_DESC(timeout_period, "watchdog timeout period (0x1F = default), timeout maxCycles = (1 << period)"); +MODULE_PARM_DESC(timeout_period, "watchdog timeout period (1 to 0x23(default), timeout maxCycles = (1 << period)"); module_param_named(timeout_period, amdgpu_watchdog_timer.period, uint, 0644); /** diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c index 1faeae14ead9..44024ab93577 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c @@ -1138,6 +1138,13 @@ void gfx_v9_4_2_enable_watchdog_timer(struct amdgpu_device *adev) data = REG_SET_FIELD(0, SQ_TIMEOUT_CONFIG, TIMEOUT_FATAL_DISABLE, amdgpu_watchdog_timer.timeout_fatal_disable ? 1 : 0); + + if (amdgpu_watchdog_timer.timeout_fatal_disable && + (amdgpu_watchdog_timer.period < 1 || +amdgpu_watchdog_timer.period > 0x23)) { + dev_warn(adev->dev, "Watchdog period range is 1 to 0x23\n"); + amdgpu_watchdog_timer.period = 0x23; + } data = REG_SET_FIELD(data, SQ_TIMEOUT_CONFIG, PERIOD_SEL, amdgpu_watchdog_timer.period); -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 147/159] drm/amdgpu: restore aldebaran save ttmp and trap config on init (v2)
Am 2021-02-25 um 1:32 p.m. schrieb Deucher, Alexander: > > [AMD Official Use Only - Internal Distribution Only] > > > I dropped the KFD debugger hunks and just added the gfx 9.4.2 changes > since these were required for a bunch of later patches that build on > that file that are not dependent on debugger. I see. > I can rework the commit message if you'd like. No, that's fine. I'd prefer to keep the original message to correlate it with the remaining change on the DKMS branch. Thanks, Felix > > Alex > > > *From:* Kuehling, Felix > *Sent:* Wednesday, February 24, 2021 10:22 PM > *To:* Deucher, Alexander ; > amd-gfx@lists.freedesktop.org > *Cc:* Kim, Jonathan > *Subject:* Re: [PATCH 147/159] drm/amdgpu: restore aldebaran save ttmp > and trap config on init (v2) > > This patch is for the debugger functionality that's not on > amd-staging-drm-next yet. You can probably drop this patch for now. > > Regards, > Felix > > On 2021-02-24 5:18 p.m., Alex Deucher wrote: > > From: Jonathan Kim > > > > Initialization of TRAP_DATA0/1 is still required for the debugger to > detect > > new waves on Aldebaran. Also, per-vmid global trap enablement may be > > required outside of debugger scope so move to init phase. > > > > v2: just add the gfx 9.4.2 changes (Alex) > > > > Signed-off-by: Jonathan Kim > > Reviewed-by: Felix Kuehling > > Signed-off-by: Alex Deucher > > --- > > drivers/gpu/drm/amd/amdgpu/Makefile | 1 + > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 1 + > > drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 50 + > > drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h | 30 +++ > > 4 files changed, 82 insertions(+) > > create mode 100644 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > > create mode 100644 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile > b/drivers/gpu/drm/amd/amdgpu/Makefile > > index c5ec926bc6d5..741b68874e53 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > > @@ -123,6 +123,7 @@ amdgpu-y += \ > > gfx_v8_0.o \ > > gfx_v9_0.o \ > > gfx_v9_4.o \ > > + gfx_v9_4_2.o \ > > gfx_v10_0.o > > > > # add async DMA block > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > index 5bac5659e707..78bb4e28c27c 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > > @@ -49,6 +49,7 @@ > > > > #include "gfx_v9_4.h" > > #include "gfx_v9_0.h" > > +#include "gfx_v9_4_2.h" > > > > #include "asic_reg/pwr/pwr_10_0_offset.h" > > #include "asic_reg/pwr/pwr_10_0_sh_mask.h" > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > > new file mode 100644 > > index ..0c2ccbe327ab > > --- /dev/null > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > > @@ -0,0 +1,50 @@ > > +/* > > + * Copyright 2020 Advanced Micro Devices, Inc. > > + * > > + * Permission is hereby granted, free of charge, to any person > obtaining a > > + * copy of this software and associated documentation files (the > "Software"), > > + * to deal in the Software without restriction, including without > limitation > > + * the rights to use, copy, modify, merge, publish, distribute, > sublicense, > > + * and/or sell copies of the Software, and to permit persons to > whom the > > + * Software is furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice shall be > included in > > + * all copies or substantial portions of the Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO > EVENT SHALL > > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, > DAMAGES OR > > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > OTHERWISE, > > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE > USE OR > > + * OTHER DEALINGS IN THE SOFTWARE. > > + * > > + */ > > +#include "amdgpu.h" > > +#include "soc15.h" > > + > > +#include "gc/gc_9_4_2_offset.h" > > +#include "gc/gc_9_4_2_sh_mask.h" > > + > > +void gfx_v9_4_2_debug_trap_config_init(struct amdgpu_device *adev, > > + uint32_t first_vmid, > > + uint32_t last_vmid) > > +{ > > + uint32_t data; > > + int i; > > + > > + mutex_lock(&adev->srbm_mutex); > > + > > + for (i = first_vmid; i < last_vmid; i++) { > > + data = 0; > > + soc15_grbm_select(adev, 0, 0, 0, i); > > + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, > TRAP_EN, 1); > > + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNT
Re: [PATCH v2] drm/scheduler: Fix hang when sched_entity released
Am 25.02.21 um 17:03 schrieb Andrey Grodzovsky: On 2021-02-25 2:53 a.m., Christian König wrote: Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: Ping Sorry, I've been on vacation this week. Andrey On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: On 2/20/21 3:38 AM, Christian König wrote: Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: On 2/18/21 10:15 AM, Christian König wrote: Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: On 2/18/21 3:07 AM, Christian König wrote: Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: Problem: If scheduler is already stopped by the time sched_entity is released and entity's job_queue not empty I encountred a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle never becomes false. Fix: In drm_sched_fini detach all sched_entities from the scheduler's run queues. This will satisfy drm_sched_entity_is_idle. Also wakeup all those processes stuck in sched_entity flushing as the scheduler main thread which wakes them up is stopped by now. v2: Reverse order of drm_sched_rq_remove_entity and marking s_entity as stopped to prevent reinserion back to rq due to race. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/scheduler/sched_main.c | 31 +++ 1 file changed, 31 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 908b0b5..c6b7947 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); */ void drm_sched_fini(struct drm_gpu_scheduler *sched) { + int i; + struct drm_sched_entity *s_entity; BTW: Please order that so that i is declared last. if (sched->thread) kthread_stop(sched->thread); + /* Detach all sched_entites from this scheduler once it's stopped */ + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + struct drm_sched_rq *rq = &sched->sched_rq[i]; + + if (!rq) + continue; + + /* Loop this way because rq->lock is taken in drm_sched_rq_remove_entity */ + spin_lock(&rq->lock); + while ((s_entity = list_first_entry_or_null(&rq->entities, + struct drm_sched_entity, + list))) { + spin_unlock(&rq->lock); + + /* Prevent reinsertion and remove */ + spin_lock(&s_entity->rq_lock); + s_entity->stopped = true; + drm_sched_rq_remove_entity(rq, s_entity); + spin_unlock(&s_entity->rq_lock); Well this spin_unlock/lock dance here doesn't look correct at all now. Christian. In what way ? It's in the same same order as in other call sites (see drm_sched_entity_push_job and drm_sched_entity_flush). If i just locked rq->lock and did list_for_each_entry_safe while manually deleting entity->list instead of calling drm_sched_rq_remove_entity this still would not be possible as the order of lock acquisition between s_entity->rq_lock and rq->lock would be reverse compared to the call sites mentioned above. Ah, now I understand. You need this because drm_sched_rq_remove_entity() will grab the rq lock again! Problem is now what prevents the entity from being destroyed while you remove it? Christian. Right, well, since (unfortunately) sched_entity is part of amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted there is a problem here that we don't increment amdgpu_ctx.refcount when assigning sched_entity to new rq (e.g. before drm_sched_rq_add_entity) and not decrement before removing. We do it for amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and amdgpu_cs_parser_fini by calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit tricky due to all the drm_sched_entity_select_rq logic. Another, kind of a band aid fix, would probably be just locking amdgpu_ctx_mgr.lock around drm_sched_fini when finalizing the fence driver and around idr iteration in amdgpu_ctx_mgr_fini (which should be lock protected anyway as I see from other idr usages in the code) ... This should prevent this use after free. Puh, that's rather complicated as well. Ok let's look at it from the other side for a moment. Why do we have to remove the entities from the rq in the first place? Wouldn't it be sufficient to just set all of them to stopped? Christian. And adding it as another condition in drm_sched_entity_is_idle ? Yes, I think that should work. In this case reverse locking order is created, In drm_sched_entity_push_job and drm_sched_entity_flush lock entity->rq_lock locked first and rq->lock locked second. In drm_sched_fini on the other hand, I need to lock rq->lock first to iterate rq->entities and then lock s_entity->rq_lock for each entity to modify s_entity->stopped. I guess we could change s_entity->stopped to atomic ? Good point. But I think the memory barrier inserted by wake_up
Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint
Am 2021-02-25 um 11:48 a.m. schrieb Christian König: > > > Am 25.02.21 um 16:35 schrieb Felix Kuehling: >> Am 2021-02-25 um 8:53 a.m. schrieb Christian König: >>> >>> Am 25.02.21 um 04:15 schrieb Felix Kuehling: On 2021-02-24 10:54 a.m., Kim, Jonathan wrote: > [AMD Official Use Only - Internal Distribution Only] > >> -Original Message- >> From: Koenig, Christian >> Sent: Wednesday, February 24, 2021 4:17 AM >> To: Kim, Jonathan ; amd- >> g...@lists.freedesktop.org >> Cc: Yang, Philip ; Kuehling, Felix >> >> Subject: Re: [PATCH] drm/amdgpu: add ih call to process until >> checkpoint >> >> Am 23.02.21 um 22:10 schrieb Jonathan Kim: >>> Add IH function to allow caller to process ring entries until the >>> checkpoint write pointer. >> This needs a better description of what this will be used for. > Felix or Philip could elaborate better for HMM needs. > Debugging tools requires this but it's in experimental mode at the > moment so probably not the best place to describe here. On the HMM side we're planning to use this to drain pending page fault interrupts before we unmap memory. That should address phantom VM faults after memory is unmapped. >>> Thought so. I suggest to use a wait_event() here which on the waiter >>> side checks ih->lock and add a wake_up_all() at the end of >>> amdgpu_ih_process. >> Right. I thought about that and it should be easy to add. The reason to >> suggest busy waiting first is, that interrupt processing is supposed to >> be fast. The IRQ handler itself doesn't sleep. So I'd expect the wait >> time to be short enough that sleeping and scheduling is not worth it. > > Well the page fault IRQs are processed in a work item, so we busy wait > for another thread here and not interrupt context. Good point. > > This in turn can lead to starvation of the work handler and so a life > lock as well. > >> >> >>> I won't touch rptr or wptr at all for this. >> Not sure what's your idea here, using ih->lock. Is it to completely >> drain all IRQs until the IH ring is completely empty? > > Correct. > >> That can >> live-lock, if the GPU produces IRQs faster than the kernel can process >> them. Therefore I was looking at rptr and wptr to drain only IRQs that >> were already in the queue when the drain call was made. That also >> ensures that the wait time is bounded and should be short (unless the >> ring size is huge). > > Correct as well, but the problem here is that Jonathan's > implementation is not even remotely correct. > > See when you look at the rptr and wptr you can't be sure that they > haven't wrapped around between two looks. > > What you could do is look at both the rptr as well as the original > wptr, but that is tricky. The code in Jon's patch was suggested by me. I check for wrapping by comparing rptr with the rptr from the previous loop iteration. Comparing rptr with wptr you can never be sure whether rptr has already passed wptr, or whether rptr has to wrap first. I could see a compromise where we sleep and wake up the waiting threads when 1. the IH ring is empty 2. the IH rptr wraps That should be good enough to ensure a quick response in the common case (no interrupt storm), and a reasonable response in the interrupt storm case. But then it's still not clear what's the correct condition for checking that the interrupts the caller cares about have been drained. Looking at just rptr and wptr is always ambiguous. Maybe we could use timestamps of the last processed interrupt? Those 48-bit time stamps wrap much less frequently. The idea is this: * At the start get the timestamp of the last written IH ring entry (checkpoint) * Wait until the last_processed IH timestamp passes the checkpoint: sign_extend(last_processed - checkpoint) >= 0 Regards, Felix > > Regards, > Christian. > >> >> Regards, >> Felix >> >> >>> Regards, >>> Christian. >>> Regards, Felix >>> Suggested-by: Felix Kuehling >>> Signed-off-by: Jonathan Kim >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46 >> +- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 2 ++ >>> 2 files changed, 47 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c >>> index dc852af4f3b7..cae50af9559d 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c >>> @@ -22,7 +22,7 @@ >>> */ >>> >>> #include >>> - >>> +#include >>> #include "amdgpu.h" >>> #include "amdgpu_ih.h" >>> >>> @@ -160,6 +160,50 @@ void amdgpu_ih_ring_write(struct >> amdgpu_ih_ring *ih, const uint32_t *iv, >>> } >>> } >>> >>> +/** >>> + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs >>> up to >>> +che
Re: [PATCH 147/159] drm/amdgpu: restore aldebaran save ttmp and trap config on init (v2)
[AMD Official Use Only - Internal Distribution Only] I dropped the KFD debugger hunks and just added the gfx 9.4.2 changes since these were required for a bunch of later patches that build on that file that are not dependent on debugger. I can rework the commit message if you'd like. Alex From: Kuehling, Felix Sent: Wednesday, February 24, 2021 10:22 PM To: Deucher, Alexander ; amd-gfx@lists.freedesktop.org Cc: Kim, Jonathan Subject: Re: [PATCH 147/159] drm/amdgpu: restore aldebaran save ttmp and trap config on init (v2) This patch is for the debugger functionality that's not on amd-staging-drm-next yet. You can probably drop this patch for now. Regards, Felix On 2021-02-24 5:18 p.m., Alex Deucher wrote: > From: Jonathan Kim > > Initialization of TRAP_DATA0/1 is still required for the debugger to detect > new waves on Aldebaran. Also, per-vmid global trap enablement may be > required outside of debugger scope so move to init phase. > > v2: just add the gfx 9.4.2 changes (Alex) > > Signed-off-by: Jonathan Kim > Reviewed-by: Felix Kuehling > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 1 + > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 1 + > drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 50 + > drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h | 30 +++ > 4 files changed, 82 insertions(+) > create mode 100644 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > create mode 100644 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile > b/drivers/gpu/drm/amd/amdgpu/Makefile > index c5ec926bc6d5..741b68874e53 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -123,6 +123,7 @@ amdgpu-y += \ >gfx_v8_0.o \ >gfx_v9_0.o \ >gfx_v9_4.o \ > + gfx_v9_4_2.o \ >gfx_v10_0.o > > # add async DMA block > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > index 5bac5659e707..78bb4e28c27c 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c > @@ -49,6 +49,7 @@ > > #include "gfx_v9_4.h" > #include "gfx_v9_0.h" > +#include "gfx_v9_4_2.h" > > #include "asic_reg/pwr/pwr_10_0_offset.h" > #include "asic_reg/pwr/pwr_10_0_sh_mask.h" > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > new file mode 100644 > index ..0c2ccbe327ab > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c > @@ -0,0 +1,50 @@ > +/* > + * Copyright 2020 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + * > + */ > +#include "amdgpu.h" > +#include "soc15.h" > + > +#include "gc/gc_9_4_2_offset.h" > +#include "gc/gc_9_4_2_sh_mask.h" > + > +void gfx_v9_4_2_debug_trap_config_init(struct amdgpu_device *adev, > + uint32_t first_vmid, > + uint32_t last_vmid) > +{ > + uint32_t data; > + int i; > + > + mutex_lock(&adev->srbm_mutex); > + > + for (i = first_vmid; i < last_vmid; i++) { > + data = 0; > + soc15_grbm_select(adev, 0, 0, 0, i); > + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, TRAP_EN, 1); > + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_EN, 0); > + data = REG_SET_FIELD(data, SPI_GDBG_PER_VMID_CNTL, EXCP_REPLACE, > + 0); > + WREG32(SOC15_REG_OFFSET(GC, 0, regSPI_GDBG_PER_VMID_CNTL), > data); > + } > + > + soc15_grbm_select(adev, 0, 0, 0, 0); > + mutex_unlock(&adev->srbm_mutex); > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h > new file mode 100644 > index ..5b175c10de23 > --- /dev/null > +++ b/driv
Re: [PATCH v6 3/3] drm/amd/display: Skip modeset for front porch change
On 2021-02-12 8:08 p.m., Aurabindo Pillai wrote: [Why] A seamless transition between modes can be performed if the new incoming mode has the same timing parameters as the optimized mode on a display with a variable vtotal min/max. Smooth video playback usecases can be enabled with this seamless transition by switching to a new mode which has a refresh rate matching the video. [How] Skip full modeset if userspace requested a compatible freesync mode which only differs in the front porch timing from the current mode. Signed-off-by: Aurabindo Pillai Acked-by: Christian König --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 220 ++ 1 file changed, 180 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index c472905c7d72..628fec855e14 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -212,6 +212,9 @@ static bool amdgpu_dm_psr_disable_all(struct amdgpu_display_manager *dm); static const struct drm_format_info * amd_get_format_info(const struct drm_mode_fb_cmd2 *cmd); +static bool +is_timing_unchanged_for_freesync(struct drm_crtc_state *old_crtc_state, +struct drm_crtc_state *new_crtc_state); /* * dm_vblank_get_counter * @@ -335,6 +338,17 @@ static inline bool amdgpu_dm_vrr_active(struct dm_crtc_state *dm_state) dm_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED; } +static inline bool is_dc_timing_adjust_needed(struct dm_crtc_state *old_state, + struct dm_crtc_state *new_state) +{ + if (new_state->freesync_config.state == VRR_STATE_ACTIVE_FIXED) + return true; + else if (amdgpu_dm_vrr_active(old_state) != amdgpu_dm_vrr_active(new_state)) + return true; + else + return false; +} + /** * dm_pflip_high_irq() - Handle pageflip interrupt * @interrupt_params: ignored @@ -5008,19 +5022,16 @@ static void fill_stream_properties_from_drm_display_mode( timing_out->hdmi_vic = hv_frame.vic; } - timing_out->h_addressable = mode_in->crtc_hdisplay; - timing_out->h_total = mode_in->crtc_htotal; - timing_out->h_sync_width = - mode_in->crtc_hsync_end - mode_in->crtc_hsync_start; - timing_out->h_front_porch = - mode_in->crtc_hsync_start - mode_in->crtc_hdisplay; - timing_out->v_total = mode_in->crtc_vtotal; - timing_out->v_addressable = mode_in->crtc_vdisplay; - timing_out->v_front_porch = - mode_in->crtc_vsync_start - mode_in->crtc_vdisplay; - timing_out->v_sync_width = - mode_in->crtc_vsync_end - mode_in->crtc_vsync_start; - timing_out->pix_clk_100hz = mode_in->crtc_clock * 10; + timing_out->h_addressable = mode_in->hdisplay; + timing_out->h_total = mode_in->htotal; + timing_out->h_sync_width = mode_in->hsync_end - mode_in->hsync_start; + timing_out->h_front_porch = mode_in->hsync_start - mode_in->hdisplay; + timing_out->v_total = mode_in->vtotal; + timing_out->v_addressable = mode_in->vdisplay; + timing_out->v_front_porch = mode_in->vsync_start - mode_in->vdisplay; + timing_out->v_sync_width = mode_in->vsync_end - mode_in->vsync_start; + timing_out->pix_clk_100hz = mode_in->clock * 10; + timing_out->aspect_ratio = get_aspect_ratio(mode_in); stream->output_color_space = get_output_color_space(timing_out); @@ -5240,6 +5251,33 @@ get_highest_refresh_rate_mode(struct amdgpu_dm_connector *aconnector, return m_pref; } +static bool is_freesync_video_mode(struct drm_display_mode *mode, + struct amdgpu_dm_connector *aconnector) +{ + struct drm_display_mode *high_mode; + int timing_diff; + + high_mode = get_highest_refresh_rate_mode(aconnector, false); + if (!high_mode || !mode) + return false; + + timing_diff = high_mode->vtotal - mode->vtotal; + + if (high_mode->clock == 0 || high_mode->clock != mode->clock || + high_mode->hdisplay != mode->hdisplay || + high_mode->vdisplay != mode->vdisplay || + high_mode->hsync_start != mode->hsync_start || + high_mode->hsync_end != mode->hsync_end || + high_mode->htotal != mode->htotal || + high_mode->hskew != mode->hskew || + high_mode->vscan != mode->vscan || + high_mode->vsync_start - mode->vsync_start != timing_diff || + high_mode->vsync_end - mode->vsync_end != timing_diff) + return false; + else + return true; +} + static struct dc_stream_state * create_stream_for_sink(struct amdgpu_dm_connector *aconnector, const struct drm_display_mode *drm_mode, @@ -5253,8 +5291,10 @@
Re: [PATCH] drm/amdgpu: Remove amdgpu_device arg from free_sgt api
Am 25.02.21 um 03:49 schrieb Ramesh Errabolu: Currently callers have to provide handle of amdgpu_device, which is not used by the implementation. It is unlikely this parameter will become useful in future, thus removing it Signed-off-by: Ramesh Errabolu Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 3 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 3 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 4 +--- 3 files changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 2808d5752de1..e83d73ec0e9d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -359,14 +359,13 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment *attach, struct dma_buf *dma_buf = attach->dmabuf; struct drm_gem_object *obj = dma_buf->priv; struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); - struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); if (sgt->sgl->page_link) { dma_unmap_sgtable(attach->dev, sgt, dir, 0); sg_free_table(sgt); kfree(sgt); } else { - amdgpu_vram_mgr_free_sgt(adev, attach->dev, dir, sgt); + amdgpu_vram_mgr_free_sgt(attach->dev, dir, sgt); } } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index 927d33d75c22..028f200a3509 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -121,8 +121,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev, struct device *dev, enum dma_data_direction dir, struct sg_table **sgt); -void amdgpu_vram_mgr_free_sgt(struct amdgpu_device *adev, - struct device *dev, +void amdgpu_vram_mgr_free_sgt(struct device *dev, enum dma_data_direction dir, struct sg_table *sgt); uint64_t amdgpu_vram_mgr_usage(struct ttm_resource_manager *man); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index b325b067926b..14936bc713b0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -734,15 +734,13 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev, /** * amdgpu_vram_mgr_free_sgt - allocate and fill a sg table * - * @adev: amdgpu device pointer * @dev: device pointer * @dir: data direction of resource to unmap * @sgt: sg table to free * * Free a previously allocate sg table. */ -void amdgpu_vram_mgr_free_sgt(struct amdgpu_device *adev, - struct device *dev, +void amdgpu_vram_mgr_free_sgt(struct device *dev, enum dma_data_direction dir, struct sg_table *sgt) { ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint
Am 25.02.21 um 16:35 schrieb Felix Kuehling: Am 2021-02-25 um 8:53 a.m. schrieb Christian König: Am 25.02.21 um 04:15 schrieb Felix Kuehling: On 2021-02-24 10:54 a.m., Kim, Jonathan wrote: [AMD Official Use Only - Internal Distribution Only] -Original Message- From: Koenig, Christian Sent: Wednesday, February 24, 2021 4:17 AM To: Kim, Jonathan ; amd- g...@lists.freedesktop.org Cc: Yang, Philip ; Kuehling, Felix Subject: Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint Am 23.02.21 um 22:10 schrieb Jonathan Kim: Add IH function to allow caller to process ring entries until the checkpoint write pointer. This needs a better description of what this will be used for. Felix or Philip could elaborate better for HMM needs. Debugging tools requires this but it's in experimental mode at the moment so probably not the best place to describe here. On the HMM side we're planning to use this to drain pending page fault interrupts before we unmap memory. That should address phantom VM faults after memory is unmapped. Thought so. I suggest to use a wait_event() here which on the waiter side checks ih->lock and add a wake_up_all() at the end of amdgpu_ih_process. Right. I thought about that and it should be easy to add. The reason to suggest busy waiting first is, that interrupt processing is supposed to be fast. The IRQ handler itself doesn't sleep. So I'd expect the wait time to be short enough that sleeping and scheduling is not worth it. Well the page fault IRQs are processed in a work item, so we busy wait for another thread here and not interrupt context. This in turn can lead to starvation of the work handler and so a life lock as well. I won't touch rptr or wptr at all for this. Not sure what's your idea here, using ih->lock. Is it to completely drain all IRQs until the IH ring is completely empty? Correct. That can live-lock, if the GPU produces IRQs faster than the kernel can process them. Therefore I was looking at rptr and wptr to drain only IRQs that were already in the queue when the drain call was made. That also ensures that the wait time is bounded and should be short (unless the ring size is huge). Correct as well, but the problem here is that Jonathan's implementation is not even remotely correct. See when you look at the rptr and wptr you can't be sure that they haven't wrapped around between two looks. What you could do is look at both the rptr as well as the original wptr, but that is tricky. Regards, Christian. Regards, Felix Regards, Christian. Regards, Felix Suggested-by: Felix Kuehling Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 2 ++ 2 files changed, 47 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index dc852af4f3b7..cae50af9559d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -22,7 +22,7 @@ */ #include - +#include #include "amdgpu.h" #include "amdgpu_ih.h" @@ -160,6 +160,50 @@ void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const uint32_t *iv, } } +/** + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs up to +checkpoint + * + * @adev: amdgpu_device pointer + * @ih: ih ring to process + * + * Used to ensure ring has processed IVs up to the checkpoint write pointer. + */ +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device *adev, +struct amdgpu_ih_ring *ih) +{ +u32 prev_rptr, cur_rptr, checkpoint_wptr; + +if (!ih->enabled || adev->shutdown) +return -ENODEV; + +cur_rptr = READ_ONCE(ih->rptr); +/* Order read of current rptr with checktpoint wptr. */ +mb(); +checkpoint_wptr = amdgpu_ih_get_wptr(adev, ih); + +/* allow rptr to wrap around */ +if (cur_rptr > checkpoint_wptr) { +spin_begin(); +do { +spin_cpu_relax(); +prev_rptr = cur_rptr; +cur_rptr = READ_ONCE(ih->rptr); +} while (cur_rptr >= prev_rptr); +spin_end(); That's a certain NAK since it busy waits for IH processing. We need some event to trigger here. The function is meant to be just a waiter up to the checkpoint. There's a need to guarantee that "stale" interrupts have been processed on check before doing other stuff after call. The description could be improved to clarify that. Would busy waiting only on a locked ring help? I assume an unlocked ring means nothing to process so no need to wait and we can exit early. Or is it better to just to process the entries up to the checkpoint (maybe adjust amdgpu_ih_process for this need like adding a bool arg to skip restart or something)? Thanks, Jon +} + +/* wait for rptr to catch up to or pass checkpoint. */ +spin_begin(); +do { +spin_cpu_relax(); +prev_rptr = cur_rptr; +cur_rptr = READ_ONCE(ih->rptr); +} while (cur_rptr >= prev_rptr && cur_rptr < checkpoint_wptr); Same of course here. Christian. +sp
[PATCH] drm/amdgpu: enable TMZ by default on Raven asics
This has been stable for a while. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 10 ++ 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 2f71d36d2856..21504ea9085f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -164,7 +164,7 @@ int amdgpu_discovery = -1; int amdgpu_mes; int amdgpu_noretry = -1; int amdgpu_force_asic_type = -1; -int amdgpu_tmz; +int amdgpu_tmz = -1; /* auto */ int amdgpu_reset_method = -1; /* auto */ int amdgpu_num_kcq = -1; @@ -790,7 +790,7 @@ module_param_named(abmlevel, amdgpu_dm_abm_level, uint, 0444); * * The default value: 0 (off). TODO: change to auto till it is completed. */ -MODULE_PARM_DESC(tmz, "Enable TMZ feature (-1 = auto, 0 = off (default), 1 = on)"); +MODULE_PARM_DESC(tmz, "Enable TMZ feature (-1 = auto (default), 0 = off, 1 = on)"); module_param_named(tmz, amdgpu_tmz, int, 0444); /** diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index fe1a39ffda72..1a892526d020 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -384,6 +384,16 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev) { switch (adev->asic_type) { case CHIP_RAVEN: + if (amdgpu_tmz == 0) { + adev->gmc.tmz_enabled = false; + dev_info(adev->dev, +"Trusted Memory Zone (TMZ) feature disabled (cmd line)\n"); + } else { + adev->gmc.tmz_enabled = true; + dev_info(adev->dev, +"Trusted Memory Zone (TMZ) feature enabled\n"); + } + break; case CHIP_RENOIR: case CHIP_NAVI10: case CHIP_NAVI14: -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
On 2021-02-25 5:25 a.m., Daniel Vetter wrote: On Wed, Feb 24, 2021 at 11:30:50AM -0500, Andrey Grodzovsky wrote: On 2021-02-19 5:24 a.m., Daniel Vetter wrote: On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky wrote: Looked a bit into it, I want to export sync_object to FD and import from that FD such that I will wait on the imported sync object handle from one thread while signaling the exported sync object handle from another (post device unplug) ? My problem is how to create a sync object with a non signaled 'fake' fence ? I only see API that creates it with already signaled fence (or none) - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fdrm_syncobj.c%23L56&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc6828a032b80464fc0f008d8d977bc32%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637498455582209331%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Izca%2BYNliCefXqAgOIX%2Bs3XQ1vWVGXbfAh28B%2F51blQ%3D&reserved=0 P.S I expect the kernel to crash since unlike with dma_bufs we don't hold drm device reference here on export. Well maybe there's no crash. I think if you go through all your dma_fence that you have and force-complete them, then I think external callers wont go into the driver anymore. But there's still pointers potentially pointing at your device struct and all that, but should work. Still needs some audit ofc. Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly - submit cs - get the fence for that (either sync_file, but I don't think amdgpu supports that, or maybe through drm_syncobj) - hotunplug - wait on that fence somehow (drm_syncobj has direct uapi for this, same for sync_file I think) Cheers, Daniel Indeed worked fine, did with 2 devices. Since syncobj is refcounted, even after I destroyed the original syncobj and unplugged the device, the exported syncobj and the fence inside didn't go anywhere. See my 3 tests in my branch on Gitlab https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fagrodzov%2Figt-gpu-tools%2F-%2Fcommits%2Fmaster&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc6828a032b80464fc0f008d8d977bc32%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637498455582209331%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=IvoCIggDUV3EgDqOhrJokWei%2B6byg%2Be9cfaJel9y2RY%3D&reserved=0 and let me know if I should go ahead and do a merge request (into which target project/branch ?) or you have more comments. igt still works with patch submission. -Daniel I see, Need to divert to other work for a while, will get to it once I am back to device unplug. Andrey Andrey Andrey On 2/9/21 4:50 AM, Daniel Vetter wrote: Yeah in the end we'd need 2 hw devices for testing full fence functionality. A useful intermediate step would be to just export the fence (either as sync_file, which I think amdgpu doesn't support because no android egl support in mesa) or drm_syncobj (which you can do as standalone fd too iirc), and then just using the fence a bit from userspace (like wait on it or get its status) after the device is unplugged. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v2] drm/scheduler: Fix hang when sched_entity released
On 2021-02-25 2:53 a.m., Christian König wrote: Am 24.02.21 um 16:13 schrieb Andrey Grodzovsky: Ping Sorry, I've been on vacation this week. Andrey On 2021-02-20 7:12 a.m., Andrey Grodzovsky wrote: On 2/20/21 3:38 AM, Christian König wrote: Am 18.02.21 um 17:41 schrieb Andrey Grodzovsky: On 2/18/21 10:15 AM, Christian König wrote: Am 18.02.21 um 16:05 schrieb Andrey Grodzovsky: On 2/18/21 3:07 AM, Christian König wrote: Am 17.02.21 um 22:59 schrieb Andrey Grodzovsky: Problem: If scheduler is already stopped by the time sched_entity is released and entity's job_queue not empty I encountred a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle never becomes false. Fix: In drm_sched_fini detach all sched_entities from the scheduler's run queues. This will satisfy drm_sched_entity_is_idle. Also wakeup all those processes stuck in sched_entity flushing as the scheduler main thread which wakes them up is stopped by now. v2: Reverse order of drm_sched_rq_remove_entity and marking s_entity as stopped to prevent reinserion back to rq due to race. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/scheduler/sched_main.c | 31 +++ 1 file changed, 31 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 908b0b5..c6b7947 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -897,9 +897,40 @@ EXPORT_SYMBOL(drm_sched_init); */ void drm_sched_fini(struct drm_gpu_scheduler *sched) { + int i; + struct drm_sched_entity *s_entity; BTW: Please order that so that i is declared last. if (sched->thread) kthread_stop(sched->thread); + /* Detach all sched_entites from this scheduler once it's stopped */ + for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { + struct drm_sched_rq *rq = &sched->sched_rq[i]; + + if (!rq) + continue; + + /* Loop this way because rq->lock is taken in drm_sched_rq_remove_entity */ + spin_lock(&rq->lock); + while ((s_entity = list_first_entry_or_null(&rq->entities, + struct drm_sched_entity, + list))) { + spin_unlock(&rq->lock); + + /* Prevent reinsertion and remove */ + spin_lock(&s_entity->rq_lock); + s_entity->stopped = true; + drm_sched_rq_remove_entity(rq, s_entity); + spin_unlock(&s_entity->rq_lock); Well this spin_unlock/lock dance here doesn't look correct at all now. Christian. In what way ? It's in the same same order as in other call sites (see drm_sched_entity_push_job and drm_sched_entity_flush). If i just locked rq->lock and did list_for_each_entry_safe while manually deleting entity->list instead of calling drm_sched_rq_remove_entity this still would not be possible as the order of lock acquisition between s_entity->rq_lock and rq->lock would be reverse compared to the call sites mentioned above. Ah, now I understand. You need this because drm_sched_rq_remove_entity() will grab the rq lock again! Problem is now what prevents the entity from being destroyed while you remove it? Christian. Right, well, since (unfortunately) sched_entity is part of amdgpu_ctx_entity and amdgpu_ctx_entity is refcounted there is a problem here that we don't increment amdgpu_ctx.refcount when assigning sched_entity to new rq (e.g. before drm_sched_rq_add_entity) and not decrement before removing. We do it for amdgpu_cs_parser.entity for example (in amdgpu_cs_parser_init and amdgpu_cs_parser_fini by calling amdgpu_ctx_get and amdgpu_ctx_put). But this seems a bit tricky due to all the drm_sched_entity_select_rq logic. Another, kind of a band aid fix, would probably be just locking amdgpu_ctx_mgr.lock around drm_sched_fini when finalizing the fence driver and around idr iteration in amdgpu_ctx_mgr_fini (which should be lock protected anyway as I see from other idr usages in the code) ... This should prevent this use after free. Puh, that's rather complicated as well. Ok let's look at it from the other side for a moment. Why do we have to remove the entities from the rq in the first place? Wouldn't it be sufficient to just set all of them to stopped? Christian. And adding it as another condition in drm_sched_entity_is_idle ? Yes, I think that should work. In this case reverse locking order is created, In drm_sched_entity_push_job and drm_sched_entity_flush lock entity->rq_lock locked first and rq->lock locked second. In drm_sched_fini on the other hand, I need to lock rq->lock first to iterate rq->entities and then lock s_entity->rq_lock for each entity to modify s_entity->stopped. I guess we could change s_entity->stopped to atomic ? Andrey Christian. Andrey Andrey Andrey + + spin_lock(&rq->lock); +
[PATCH] amdgpu/pm: read_sensor() report failure apporpriately
report -ENOTSUPP instead of -EINVAL, so that if userspace fails to read sensor data can figure it out the failure correctly. Signed-off-by: Shirish S --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/si_dpm.c | 2 +- 8 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c index 2c90f715ee0a..c932b632ddd4 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c @@ -1285,7 +1285,7 @@ static int smu10_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 4; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index c57dc9ae81f2..a58f92249c53 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -3945,7 +3945,7 @@ static int smu7_read_sensor(struct pp_hwmgr *hwmgr, int idx, *((uint32_t *)value) = (uint32_t)convert_to_vddc(val_vid); return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c index ed9b89980184..2cef9c0c6d6f 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c @@ -1805,7 +1805,7 @@ static int smu8_read_sensor(struct pp_hwmgr *hwmgr, int idx, *((uint32_t *)value) = smu8_thermal_get_temperature(hwmgr); return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c index 29c99642d22d..5e875ad8d633 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c @@ -3890,7 +3890,7 @@ static int vega10_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 8; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c index c0753029a8e2..a827f2bc7904 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c @@ -1429,7 +1429,7 @@ static int vega12_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 8; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } return ret; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c index 87811b005b85..e8eec2539c17 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c @@ -2240,7 +2240,7 @@ static int vega20_read_sensor(struct pp_hwmgr *hwmgr, int idx, *size = 8; break; default: - ret = -EINVAL; + ret = -EOPNOTSUPP; break; } return ret; diff --git a/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c b/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c index 66daabebee35..bcae42cef374 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c +++ b/drivers/gpu/drm/amd/pm/powerplay/kv_dpm.c @@ -3305,7 +3305,7 @@ static int kv_dpm_read_sensor(void *handle, int idx, *size = 4; return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } diff --git a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c index 62291358fb1c..26a5321e621b 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c +++ b/drivers/gpu/drm/amd/pm/powerplay/si_dpm.c @@ -8014,7 +8014,7 @@ static int si_dpm_read_sensor(void *handle, int idx, *size = 4; return 0; default: - return -EINVAL; + return -EOPNOTSUPP; } } -- 2.17.1 __
Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint
Am 2021-02-25 um 8:53 a.m. schrieb Christian König: > > > Am 25.02.21 um 04:15 schrieb Felix Kuehling: >> On 2021-02-24 10:54 a.m., Kim, Jonathan wrote: >>> [AMD Official Use Only - Internal Distribution Only] >>> -Original Message- From: Koenig, Christian Sent: Wednesday, February 24, 2021 4:17 AM To: Kim, Jonathan ; amd- g...@lists.freedesktop.org Cc: Yang, Philip ; Kuehling, Felix Subject: Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint Am 23.02.21 um 22:10 schrieb Jonathan Kim: > Add IH function to allow caller to process ring entries until the > checkpoint write pointer. This needs a better description of what this will be used for. >>> Felix or Philip could elaborate better for HMM needs. >>> Debugging tools requires this but it's in experimental mode at the >>> moment so probably not the best place to describe here. >> >> On the HMM side we're planning to use this to drain pending page >> fault interrupts before we unmap memory. That should address phantom >> VM faults after memory is unmapped. > > Thought so. I suggest to use a wait_event() here which on the waiter > side checks ih->lock and add a wake_up_all() at the end of > amdgpu_ih_process. Right. I thought about that and it should be easy to add. The reason to suggest busy waiting first is, that interrupt processing is supposed to be fast. The IRQ handler itself doesn't sleep. So I'd expect the wait time to be short enough that sleeping and scheduling is not worth it. > I won't touch rptr or wptr at all for this. Not sure what's your idea here, using ih->lock. Is it to completely drain all IRQs until the IH ring is completely empty? That can live-lock, if the GPU produces IRQs faster than the kernel can process them. Therefore I was looking at rptr and wptr to drain only IRQs that were already in the queue when the drain call was made. That also ensures that the wait time is bounded and should be short (unless the ring size is huge). Regards, Felix > > Regards, > Christian. > >> >> Regards, >> Felix >> >> >>> > Suggested-by: Felix Kuehling > Signed-off-by: Jonathan Kim > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 2 ++ > 2 files changed, 47 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > index dc852af4f3b7..cae50af9559d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > @@ -22,7 +22,7 @@ > */ > > #include > - > +#include > #include "amdgpu.h" > #include "amdgpu_ih.h" > > @@ -160,6 +160,50 @@ void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const uint32_t *iv, > } > } > > +/** > + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs up to > +checkpoint > + * > + * @adev: amdgpu_device pointer > + * @ih: ih ring to process > + * > + * Used to ensure ring has processed IVs up to the checkpoint write pointer. > + */ > +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device *adev, > +struct amdgpu_ih_ring *ih) > +{ > +u32 prev_rptr, cur_rptr, checkpoint_wptr; > + > +if (!ih->enabled || adev->shutdown) > +return -ENODEV; > + > +cur_rptr = READ_ONCE(ih->rptr); > +/* Order read of current rptr with checktpoint wptr. */ > +mb(); > +checkpoint_wptr = amdgpu_ih_get_wptr(adev, ih); > + > +/* allow rptr to wrap around */ > +if (cur_rptr > checkpoint_wptr) { > +spin_begin(); > +do { > +spin_cpu_relax(); > +prev_rptr = cur_rptr; > +cur_rptr = READ_ONCE(ih->rptr); > +} while (cur_rptr >= prev_rptr); > +spin_end(); That's a certain NAK since it busy waits for IH processing. We need some event to trigger here. >>> The function is meant to be just a waiter up to the checkpoint. >>> There's a need to guarantee that "stale" interrupts have been >>> processed on check before doing other stuff after call. >>> The description could be improved to clarify that. >>> >>> Would busy waiting only on a locked ring help? I assume an unlocked >>> ring means nothing to process so no need to wait and we can exit >>> early. Or is it better to just to process the entries up to the >>> checkpoint (maybe adjust amdgpu_ih_process for this need like adding >>> a bool arg to skip restart or something)? >>> >>> Thanks, >>> >>> Jon >>> > +} > + > +/* wait for rptr to catch up to or pass checkpoint. */ > +spin_begin(); > +do { > +spin_cpu_relax(); > +prev_rptr = cur_rptr; > +cur_rptr = READ_ONCE(ih->rptr); > +} while (cur_rptr >= prev_rptr && cur_rptr < checkpoint_wptr); Same of course here. Christian. > +spin_e
[PATCH] drm/amd/display: Fix an uninitialized index variable
From: Arnd Bergmann clang points out that the new logic uses an always-uninitialized array index: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9810:38: warning: variable 'i' is uninitialized when used here [-Wuninitialized] timing = &edid->detailed_timings[i]; ^ drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9720:7: note: initialize the variable 'i' to silence this warning My best guess is that the index should have been returned by the parse_hdmi_amd_vsdb() function that walks an array here, so do that. Fixes: f9b4f20c4777 ("drm/amd/display: Add Freesync HDMI support to DM") Signed-off-by: Arnd Bergmann --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c| 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index b19b93c74bae..667c0d52dbfa 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -9736,7 +9736,7 @@ static bool parse_edid_cea(struct amdgpu_dm_connector *aconnector, return false; } -static bool parse_hdmi_amd_vsdb(struct amdgpu_dm_connector *aconnector, +static int parse_hdmi_amd_vsdb(struct amdgpu_dm_connector *aconnector, struct edid *edid, struct amdgpu_hdmi_vsdb_info *vsdb_info) { uint8_t *edid_ext = NULL; @@ -9746,7 +9746,7 @@ static bool parse_hdmi_amd_vsdb(struct amdgpu_dm_connector *aconnector, /*- drm_find_cea_extension() -*/ /* No EDID or EDID extensions */ if (edid == NULL || edid->extensions == 0) - return false; + return -ENODEV; /* Find CEA extension */ for (i = 0; i < edid->extensions; i++) { @@ -9756,14 +9756,15 @@ static bool parse_hdmi_amd_vsdb(struct amdgpu_dm_connector *aconnector, } if (i == edid->extensions) - return false; + return -ENODEV; /*- cea_db_offsets() -*/ if (edid_ext[0] != CEA_EXT) - return false; + return -ENODEV; valid_vsdb_found = parse_edid_cea(aconnector, edid_ext, EDID_LENGTH, vsdb_info); - return valid_vsdb_found; + + return valid_vsdb_found ? i : -ENODEV; } void amdgpu_dm_update_freesync_caps(struct drm_connector *connector, @@ -9781,7 +9782,6 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector *connector, struct amdgpu_device *adev = drm_to_adev(dev); bool freesync_capable = false; struct amdgpu_hdmi_vsdb_info vsdb_info = {0}; - bool hdmi_valid_vsdb_found = false; if (!connector->state) { DRM_ERROR("%s - Connector has no state", __func__); @@ -9857,8 +9857,8 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector *connector, } } } else if (edid && amdgpu_dm_connector->dc_sink->sink_signal == SIGNAL_TYPE_HDMI_TYPE_A) { - hdmi_valid_vsdb_found = parse_hdmi_amd_vsdb(amdgpu_dm_connector, edid, &vsdb_info); - if (hdmi_valid_vsdb_found && vsdb_info.freesync_supported) { + i = parse_hdmi_amd_vsdb(amdgpu_dm_connector, edid, &vsdb_info); + if (i >= 0 && vsdb_info.freesync_supported) { timing = &edid->detailed_timings[i]; data= &timing->data.other_data; -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: enable one vf mode on navi21 vf
[AMD Official Use Only - Internal Distribution Only] Acked-by: Alex Deucher From: Horace Chen Sent: Thursday, February 25, 2021 7:04 AM To: amd-gfx@lists.freedesktop.org Cc: Grodzovsky, Andrey ; Quan, Evan ; Chen, Horace ; Tuikov, Luben ; Koenig, Christian ; Deucher, Alexander ; Xiao, Jack ; Zhang, Hawking ; Liu, Monk ; Xu, Feifei ; Wang, Kevin(Yang) ; Xiaojie Yuan Subject: [PATCH] drm/amdgpu: enable one vf mode on navi21 vf navi21 vf needs one vf mode which allows vf to set and get clock status from guest vm. So now expose the required interface and allow some smu request on VF mode. Also since navi21 blocked direct MMIO access, use KIQ to send SMU request under sriov vf. OD use same command as getting pp table which is not allowed for navi21, so remove OD feature under sriov vf. Signed-off-by: Horace Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ drivers/gpu/drm/amd/pm/amdgpu_pm.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c| 10 ++ .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 10 +- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 12 ++-- 5 files changed, 20 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f0f7ed42ee7f..dfbf2f2db0de 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2043,6 +2043,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) adev->pm.pp_feature = amdgpu_pp_feature_mask; if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS) adev->pm.pp_feature &= ~PP_GFXOFF_MASK; + if (amdgpu_sriov_vf(adev) && adev->asic_type == CHIP_SIENNA_CICHLID) + adev->pm.pp_feature &= ~PP_OVERDRIVE_MASK; for (i = 0; i < adev->num_ip_blocks; i++) { if ((amdgpu_ip_block_mask & (1 << i)) == 0) { diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index b770dd634ab6..1866cbaf70c3 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -2167,7 +2167,7 @@ static ssize_t amdgpu_get_gpu_metrics(struct device *dev, static struct amdgpu_device_attr amdgpu_device_attrs[] = { AMDGPU_DEVICE_ATTR_RW(power_dpm_state, ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF), - AMDGPU_DEVICE_ATTR_RW(power_dpm_force_performance_level, ATTR_FLAG_BASIC), + AMDGPU_DEVICE_ATTR_RW(power_dpm_force_performance_level, ATTR_FLAG_BASIC|ATTR_FLAG_ONEVF), AMDGPU_DEVICE_ATTR_RO(pp_num_states, ATTR_FLAG_BASIC), AMDGPU_DEVICE_ATTR_RO(pp_cur_state, ATTR_FLAG_BASIC), AMDGPU_DEVICE_ATTR_RW(pp_force_state, ATTR_FLAG_BASIC), diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index d143ef1b460b..7033d52eb4d0 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -612,10 +612,12 @@ static int smu_late_init(void *handle) return ret; } - ret = smu_set_default_od_settings(smu); - if (ret) { - dev_err(adev->dev, "Failed to setup default OD settings!\n"); - return ret; + if (smu->od_enabled) { + ret = smu_set_default_od_settings(smu); + if (ret) { + dev_err(adev->dev, "Failed to setup default OD settings!\n"); + return ret; + } } ret = smu_populate_umd_state_clk(smu); diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index af73e1430af5..441effc23625 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -89,17 +89,17 @@ static struct cmn2asic_msg_mapping sienna_cichlid_message_map[SMU_MSG_MAX_COUNT] MSG_MAP(GetEnabledSmuFeaturesHigh, PPSMC_MSG_GetRunningSmuFeaturesHigh, 1), MSG_MAP(SetWorkloadMask,PPSMC_MSG_SetWorkloadMask, 1), MSG_MAP(SetPptLimit,PPSMC_MSG_SetPptLimit, 0), - MSG_MAP(SetDriverDramAddrHigh, PPSMC_MSG_SetDriverDramAddrHigh, 0), - MSG_MAP(SetDriverDramAddrLow, PPSMC_MSG_SetDriverDramAddrLow, 0), + MSG_MAP(SetDriverDramAddrHigh, PPSMC_MSG_SetDriverDramAddrHigh, 1), + MSG_MAP(SetDriverDramAddrLow, PPSMC_MSG_SetDriverDramAddrLow, 1), MSG_MAP(SetToolsDramAddrHigh, PPSMC_MSG_SetToolsDramAddrHigh,0), MSG_MAP(SetToolsDramAddrLow,PPSMC_MSG_SetToolsDramAd
[PATCH] drm/amd/display: fix 64-bit integer division
From: Arnd Bergmann The new display synchronization code caused a regression on all 32-bit architectures: ld.lld: error: undefined symbol: __aeabi_uldivmod >>> referenced by dce_clock_source.c >>> >>> gpu/drm/amd/display/dc/dce/dce_clock_source.o:(get_pixel_clk_frequency_100hz) >>> in archive drivers/built-in.a ld.lld: error: undefined symbol: __aeabi_ldivmod >>> referenced by dc_resource.c >>> >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) >>> in archive drivers/built-in.a >>> referenced by dc_resource.c >>> >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) >>> in archive drivers/built-in.a >>> referenced by dc_resource.c >>> >>> gpu/drm/amd/display/dc/core/dc_resource.o:(resource_are_vblanks_synchronizable) >>> in archive drivers/built-in.a This is not a fast path, so the use of an explicit div_u64/div_s64 seems appropriate. Fixes: 77a2b7265f20 ("drm/amd/display: Synchronize displays with different timings") Signed-off-by: Arnd Bergmann --- drivers/gpu/drm/amd/display/dc/core/dc_resource.c| 12 ++-- .../gpu/drm/amd/display/dc/dce/dce_clock_source.c| 6 +++--- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c index 0241c9d96d7a..49214c59c836 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c @@ -441,15 +441,15 @@ bool resource_are_vblanks_synchronizable( if (stream2->timing.pix_clk_100hz*100/stream2->timing.h_total/ stream2->timing.v_total > 60) return false; - frame_time_diff = (int64_t)1 * + frame_time_diff = div_s64(1ll * stream1->timing.h_total * stream1->timing.v_total * - stream2->timing.pix_clk_100hz / - stream1->timing.pix_clk_100hz / - stream2->timing.h_total / - stream2->timing.v_total; + stream2->timing.pix_clk_100hz, + stream1->timing.pix_clk_100hz * + stream2->timing.h_total * + stream2->timing.v_total); for (i = 0; i < rr_count; i++) { - int64_t diff = (frame_time_diff * base60_refresh_rates[i]) / 10 - 1; + int64_t diff = div_s64(frame_time_diff * base60_refresh_rates[i], 10) - 1; if (diff < 0) diff = -diff; diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c index 6f47f9bab5ee..85ed6f2c9647 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c @@ -1013,9 +1013,9 @@ static bool get_pixel_clk_frequency_100hz( * not be programmed equal to DPREFCLK */ modulo_hz = REG_READ(MODULO[inst]); - *pixel_clk_khz = ((uint64_t)clock_hz* - clock_source->ctx->dc->clk_mgr->dprefclk_khz*10)/ - modulo_hz; + *pixel_clk_khz = div_u64((uint64_t)clock_hz * 10 * + clock_source->ctx->dc->clk_mgr->dprefclk_khz, + modulo_hz); } else { /* NOTE: There is agreement with VBIOS here that MODULO is * programmed equal to DPREFCLK, in which case PHASE will be -- 2.29.2 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/display: remove unnecessary conversion to bool
Fix the following coccicheck warnings: ./drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c:243:67-72: WARNING: conversion to bool not needed here. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong --- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c index 3398540..102f6a0 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp_cm.c @@ -240,7 +240,7 @@ bool dpp3_program_gamcor_lut( next_mode = LUT_RAM_A; dpp3_power_on_gamcor_lut(dpp_base, true); - dpp3_configure_gamcor_lut(dpp_base, next_mode == LUT_RAM_A ? true:false); + dpp3_configure_gamcor_lut(dpp_base, next_mode == LUT_RAM_A); if (next_mode == LUT_RAM_B) { gam_regs.start_cntl_b = REG(CM_GAMCOR_RAMB_START_CNTL_B); -- 1.8.3.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint
Am 25.02.21 um 04:15 schrieb Felix Kuehling: On 2021-02-24 10:54 a.m., Kim, Jonathan wrote: [AMD Official Use Only - Internal Distribution Only] -Original Message- From: Koenig, Christian Sent: Wednesday, February 24, 2021 4:17 AM To: Kim, Jonathan ; amd- g...@lists.freedesktop.org Cc: Yang, Philip ; Kuehling, Felix Subject: Re: [PATCH] drm/amdgpu: add ih call to process until checkpoint Am 23.02.21 um 22:10 schrieb Jonathan Kim: Add IH function to allow caller to process ring entries until the checkpoint write pointer. This needs a better description of what this will be used for. Felix or Philip could elaborate better for HMM needs. Debugging tools requires this but it's in experimental mode at the moment so probably not the best place to describe here. On the HMM side we're planning to use this to drain pending page fault interrupts before we unmap memory. That should address phantom VM faults after memory is unmapped. Thought so. I suggest to use a wait_event() here which on the waiter side checks ih->lock and add a wake_up_all() at the end of amdgpu_ih_process. I won't touch rptr or wptr at all for this. Regards, Christian. Regards, Felix Suggested-by: Felix Kuehling Signed-off-by: Jonathan Kim --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 2 ++ 2 files changed, 47 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index dc852af4f3b7..cae50af9559d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -22,7 +22,7 @@ */ #include - +#include #include "amdgpu.h" #include "amdgpu_ih.h" @@ -160,6 +160,50 @@ void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const uint32_t *iv, } } +/** + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs up to +checkpoint + * + * @adev: amdgpu_device pointer + * @ih: ih ring to process + * + * Used to ensure ring has processed IVs up to the checkpoint write pointer. + */ +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device *adev, +struct amdgpu_ih_ring *ih) +{ +u32 prev_rptr, cur_rptr, checkpoint_wptr; + +if (!ih->enabled || adev->shutdown) +return -ENODEV; + +cur_rptr = READ_ONCE(ih->rptr); +/* Order read of current rptr with checktpoint wptr. */ +mb(); +checkpoint_wptr = amdgpu_ih_get_wptr(adev, ih); + +/* allow rptr to wrap around */ +if (cur_rptr > checkpoint_wptr) { +spin_begin(); +do { +spin_cpu_relax(); +prev_rptr = cur_rptr; +cur_rptr = READ_ONCE(ih->rptr); +} while (cur_rptr >= prev_rptr); +spin_end(); That's a certain NAK since it busy waits for IH processing. We need some event to trigger here. The function is meant to be just a waiter up to the checkpoint. There's a need to guarantee that "stale" interrupts have been processed on check before doing other stuff after call. The description could be improved to clarify that. Would busy waiting only on a locked ring help? I assume an unlocked ring means nothing to process so no need to wait and we can exit early. Or is it better to just to process the entries up to the checkpoint (maybe adjust amdgpu_ih_process for this need like adding a bool arg to skip restart or something)? Thanks, Jon +} + +/* wait for rptr to catch up to or pass checkpoint. */ +spin_begin(); +do { +spin_cpu_relax(); +prev_rptr = cur_rptr; +cur_rptr = READ_ONCE(ih->rptr); +} while (cur_rptr >= prev_rptr && cur_rptr < checkpoint_wptr); Same of course here. Christian. +spin_end(); + +return 0; +} + /** * amdgpu_ih_process - interrupt handler * diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 6ed4a85fc7c3..6817f0a812d2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h @@ -87,6 +87,8 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih, void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih); void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const uint32_t *iv, unsigned int num_dw); +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device *adev, +struct amdgpu_ih_ring *ih); int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih); void amdgpu_ih_decode_iv_helper(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih, ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 088/159] drm/ttm: ioremap buffer properly according to TTM placement flag
The whole patch set needs a rebase since the TTM_PL_FLAG_* for controlling the caching doesn't exists any more upstream. How should we approach that? Thanks, Christian. Am 24.02.21 um 23:17 schrieb Alex Deucher: From: Oak Zeng If TTM placement flag is cached, buffer is intended to be mapped as cached from CPU. Map it with ioremap_cache. This wasn't necessary before as device memory was never mapped as cached from CPU side. It becomes necessary for aldebaran as device memory is mapped cached from CPU. Signed-off-by: Oak Zeng Reviewed-by: Christian Koenig Tested-by: Amber Lin Signed-off-by: Alex Deucher --- drivers/gpu/drm/ttm/ttm_bo_util.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index ee04716b2603..e11ec1ff5d0b 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -519,6 +519,10 @@ static int ttm_bo_ioremap(struct ttm_buffer_object *bo, map->virtual = ioremap_wc(bo->mem.bus.base + bo->mem.bus.offset + offset, size); + else if (mem->placement & TTM_PL_FLAG_CACHED) + map->virtual = ioremap_cache(bo->mem.bus.base + + bo->mem.bus.offset + offset, + size); else map->virtual = ioremap(bo->mem.bus.base + bo->mem.bus.offset + offset, ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v4 00/14] RFC Support hot device unplug in amdgpu
On Wed, Feb 24, 2021 at 11:30:50AM -0500, Andrey Grodzovsky wrote: > > On 2021-02-19 5:24 a.m., Daniel Vetter wrote: > > On Thu, Feb 18, 2021 at 9:03 PM Andrey Grodzovsky > > wrote: > > > Looked a bit into it, I want to export sync_object to FD and import from > > > that FD > > > such that I will wait on the imported sync object handle from one thread > > > while > > > signaling the exported sync object handle from another (post device > > > unplug) ? > > > > > > My problem is how to create a sync object with a non signaled 'fake' > > > fence ? > > > I only see API that creates it with already signaled fence (or none) - > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fdrm_syncobj.c%23L56&data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C5085bdd151c642514d2408d8d4c08e56%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637493270792459284%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=sZWIn0Lo7ZujBq0e7MdFPhJDARXWpOlLgLzANMS8cCY%3D&reserved=0 > > > > > > P.S I expect the kernel to crash since unlike with dma_bufs we don't hold > > > drm device reference here on export. > > Well maybe there's no crash. I think if you go through all your > > dma_fence that you have and force-complete them, then I think external > > callers wont go into the driver anymore. But there's still pointers > > potentially pointing at your device struct and all that, but should > > work. Still needs some audit ofc. > > > > Wrt how you get such a free-standing fence, that's amdgpu specific. Roughly > > - submit cs > > - get the fence for that (either sync_file, but I don't think amdgpu > > supports that, or maybe through drm_syncobj) > > - hotunplug > > - wait on that fence somehow (drm_syncobj has direct uapi for this, > > same for sync_file I think) > > > > Cheers, Daniel > > > Indeed worked fine, did with 2 devices. Since syncobj is refcounted, even > after I > destroyed the original syncobj and unplugged the device, the exported > syncobj and the > fence inside didn't go anywhere. > > See my 3 tests in my branch on Gitlab > https://gitlab.freedesktop.org/agrodzov/igt-gpu-tools/-/commits/master > and let me know if I should go ahead and do a merge request (into which > target project/branch ?) or you > have more comments. igt still works with patch submission. -Daniel > > Andrey > > > > > > > Andrey > > > > > > On 2/9/21 4:50 AM, Daniel Vetter wrote: > > > > Yeah in the end we'd need 2 hw devices for testing full fence > > > > functionality. A useful intermediate step would be to just export the > > > > fence (either as sync_file, which I think amdgpu doesn't support because > > > > no android egl support in mesa) or drm_syncobj (which you can do as > > > > standalone fd too iirc), and then just using the fence a bit from > > > > userspace (like wait on it or get its status) after the device is > > > > unplugged. > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amd/amdgpu: move inc gpu_reset_counter after drm_sched_stop
Am 25.02.21 um 10:16 schrieb Jingwen Chen: Move gpu_reset_counter after drm_sched_stop to avoid race condition caused by job submitted between reset_count +1 and drm_sched_stop. Signed-off-by: Jingwen Chen Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f0f7ed42ee7f..703b96cf3560 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4447,7 +4447,6 @@ static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, down_write(&adev->reset_sem); } - atomic_inc(&adev->gpu_reset_counter); switch (amdgpu_asic_reset_method(adev)) { case AMD_RESET_METHOD_MODE1: adev->mp1_state = PP_MP1_STATE_SHUTDOWN; @@ -4708,6 +4707,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (need_emergency_restart) amdgpu_job_stop_all_jobs_on_sched(&ring->sched); } + atomic_inc(&tmp_adev->gpu_reset_counter); } if (need_emergency_restart) @@ -5050,6 +5050,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta drm_sched_stop(&ring->sched, NULL); } + atomic_inc(&adev->gpu_reset_counter); return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: /* Permanent error, prepare for device removal */ ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amd/amdgpu: move inc gpu_reset_counter after drm_sched_stop
Move gpu_reset_counter after drm_sched_stop to avoid race condition caused by job submitted between reset_count +1 and drm_sched_stop. Signed-off-by: Jingwen Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f0f7ed42ee7f..703b96cf3560 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4447,7 +4447,6 @@ static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, down_write(&adev->reset_sem); } - atomic_inc(&adev->gpu_reset_counter); switch (amdgpu_asic_reset_method(adev)) { case AMD_RESET_METHOD_MODE1: adev->mp1_state = PP_MP1_STATE_SHUTDOWN; @@ -4708,6 +4707,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, if (need_emergency_restart) amdgpu_job_stop_all_jobs_on_sched(&ring->sched); } + atomic_inc(&tmp_adev->gpu_reset_counter); } if (need_emergency_restart) @@ -5050,6 +5050,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta drm_sched_stop(&ring->sched, NULL); } + atomic_inc(&adev->gpu_reset_counter); return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: /* Permanent error, prepare for device removal */ -- 2.25.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
Fix the following coccicheck warning: ./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1589:0-23: WARNING: fops_ib_preempt should be defined with DEFINE_DEBUGFS_ATTRIBUTE ./drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:1592:0-23: WARNING: fops_sclk_set should be defined with DEFINE_DEBUGFS_ATTRIBUTE Reported-by: Abaci Robot Signed-off-by: Yang Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 0a25fec..52ef488 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1586,10 +1586,10 @@ static int amdgpu_debugfs_sclk_set(void *data, u64 val) return 0; } -DEFINE_SIMPLE_ATTRIBUTE(fops_ib_preempt, NULL, +DEFINE_DEBUGFS_ATTRIBUTE(fops_ib_preempt, NULL, amdgpu_debugfs_ib_preempt, "%llu\n"); -DEFINE_SIMPLE_ATTRIBUTE(fops_sclk_set, NULL, +DEFINE_DEBUGFS_ATTRIBUTE(fops_sclk_set, NULL, amdgpu_debugfs_sclk_set, "%llu\n"); int amdgpu_debugfs_init(struct amdgpu_device *adev) -- 1.8.3.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v3 01/11] drm/atomic: Pass the full state to planes async atomic check and update
Hi, On Wed, Feb 24, 2021 at 12:33:45PM +0100, Thomas Zimmermann wrote: > Hi Maxime, > > for the whole series: > > Acked-by: Thomas Zimmermann Applied the whole series, thanks to everyone involved in the review, it's been a pretty daunting one :) Maxime signature.asc Description: PGP signature ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job
[AMD Official Use Only - Internal Distribution Only] Yeah, that sounds better than original fix Thanks Christian -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Koenig, Christian Sent: Thursday, February 25, 2021 4:08 PM To: Chen, JingWen ; amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: Re: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job Good catch, but the approach for the fix is incorrect. The device reset count can only be incremented after taking the reset lock and stopping the scheduler, otherwise a whole bunch of different race conditions can happen. Christian. Am 25.02.21 um 08:56 schrieb Chen, JingWen: > [AMD Official Use Only - Internal Distribution Only] > > Consider this sequence: > 1. GPU reset begin > 2. device reset count + 1 > 3. job id 1 scheduled with vm_need_flush=false 4. When handling this > job in vm_flush, amdgpu_vmid_had_gpu_reset will return true, thus this > job will be flush and the vmid_reset_count will be set equals to > device_reset_count 5. stop drm scheduler 6. GPU do real reset 7. > resubmit job id 1 with vm_need_flush=false 8. Job id 1 is the first > job to resubmit after reset. This time amdgpu_vmid_had_gpu_reset > returns false and the vm_need_flush==false > > Then no one will flush pd_addr and vmid for jobs after resubmit. In this > sequence amdgpu_vmid_had_gpu_reset doesn't work. > > > Best Regards, > JingWen Chen > > -Original Message- > From: Koenig, Christian > Sent: Thursday, February 25, 2021 3:46 PM > To: Chen, JingWen ; > amd-gfx@lists.freedesktop.org > Cc: Liu, Monk > Subject: Re: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job > > > > Am 25.02.21 um 06:27 schrieb Jingwen Chen: >> [Why] >> when a job is scheduled during TDR(after device reset count increase >> and before drm_sched_stop), this job won't do vm_flush when resubmit >> itself after GPU reset done. This can lead to a page fault. >> >> [How] >> Always do vm_flush for resubmit job. > NAK, what do you think amdgpu_vmid_had_gpu_reset already does? > > Christian. > >> Signed-off-by: Jingwen Chen >> --- >>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++- >>1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> index fdbe7d4e8b8b..4af2c5d15950 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >> @@ -1088,7 +1088,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct >> amdgpu_job *job, >>if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid) >>adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid); >> >> -if (amdgpu_vmid_had_gpu_reset(adev, id)) { >> +if (amdgpu_vmid_had_gpu_reset(adev, id)|| (job->base.flags & >> +DRM_FLAG_RESUBMIT_JOB)) { >>gds_switch_needed = true; >>vm_flush_needed = true; >>pasid_mapping_needed = true; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job
Good catch, but the approach for the fix is incorrect. The device reset count can only be incremented after taking the reset lock and stopping the scheduler, otherwise a whole bunch of different race conditions can happen. Christian. Am 25.02.21 um 08:56 schrieb Chen, JingWen: [AMD Official Use Only - Internal Distribution Only] Consider this sequence: 1. GPU reset begin 2. device reset count + 1 3. job id 1 scheduled with vm_need_flush=false 4. When handling this job in vm_flush, amdgpu_vmid_had_gpu_reset will return true, thus this job will be flush and the vmid_reset_count will be set equals to device_reset_count 5. stop drm scheduler 6. GPU do real reset 7. resubmit job id 1 with vm_need_flush=false 8. Job id 1 is the first job to resubmit after reset. This time amdgpu_vmid_had_gpu_reset returns false and the vm_need_flush==false Then no one will flush pd_addr and vmid for jobs after resubmit. In this sequence amdgpu_vmid_had_gpu_reset doesn't work. Best Regards, JingWen Chen -Original Message- From: Koenig, Christian Sent: Thursday, February 25, 2021 3:46 PM To: Chen, JingWen ; amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: Re: [PATCH 2/2] drm/amd/amdgpu: force flush resubmit job Am 25.02.21 um 06:27 schrieb Jingwen Chen: [Why] when a job is scheduled during TDR(after device reset count increase and before drm_sched_stop), this job won't do vm_flush when resubmit itself after GPU reset done. This can lead to a page fault. [How] Always do vm_flush for resubmit job. NAK, what do you think amdgpu_vmid_had_gpu_reset already does? Christian. Signed-off-by: Jingwen Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index fdbe7d4e8b8b..4af2c5d15950 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1088,7 +1088,8 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, if (update_spm_vmid_needed && adev->gfx.rlc.funcs->update_spm_vmid) adev->gfx.rlc.funcs->update_spm_vmid(adev, job->vmid); -if (amdgpu_vmid_had_gpu_reset(adev, id)) { +if (amdgpu_vmid_had_gpu_reset(adev, id)|| +(job->base.flags & DRM_FLAG_RESUBMIT_JOB)) { gds_switch_needed = true; vm_flush_needed = true; pasid_mapping_needed = true; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx