[PATCH] Documentation/amdgpu: Clarify MI200 and MI300 entries

2024-07-04 Thread Kent Russell
Add "Series" to MI200 and MI300 to clarify that they represent the series of cards, and to more closely match the product information materials. This also matches other entries in this list. Also correct a typo in the MI300 codename (Vangaram->Vanjaram). Signed-off-by:

[PATCH v2] drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3

2024-02-06 Thread Kent Russell
It's currently incorrectly multiplied by the number of XCCs in the partition. Fixes: 6b537864925e ("drm/amdkfd: Update cache info for GFX 9.4.3") Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-)

[PATCH] drm/amdkfd: Don't divide L2 cache by partition mode

2024-02-06 Thread Kent Russell
Partition mode only affects L3 cache size. After removing the L2 check in the previous patch, make sure we aren't dividing all cache sizes by partition mode, just L3. Fixes: a75bfb3c4045 ("drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3") Signed-off-by: Kent Russell --- drivers/g
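The rule described above (only L3 is scaled by partition mode, L2 is left alone) can be sketched in plain C. This is an illustration only, not the kfd_topology.c code: the function name, the `level` encoding, and the idea of expressing partition mode as a simple divisor are all assumptions made for this example.

```c
/*
 * Hypothetical helper: return the cache size to report for one cache level.
 * Only level 3 is divided by the partition divisor; every other level is
 * reported unchanged. A divisor of 0 or 1 means "no scaling".
 */
static unsigned int reported_cache_kb(unsigned int level,
                                      unsigned int size_kb,
                                      unsigned int partition_divisor)
{
    if (level == 3 && partition_divisor > 1)
        return size_kb / partition_divisor;

    return size_kb;
}
```

With a divisor of 4, a 4096 KB L2 would still be reported as 4096 KB, while a 4096 KB L3 would be reported as 1024 KB.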

[PATCH] drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3

2024-02-06 Thread Kent Russell
It's currently incorrectly multiplied by the number of XCCs in the partition. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd

[PATCH] drm/amdkfd: Align unique_id format to match amdgpu

2023-09-14 Thread Kent Russell
unique_id is printed as %016llx in amdgpu, but %llu in KFD. Call the sysfs_show_gen_prop function directly and use the %016llx format, to align with amdgpu. Don't need to add a new macro since this is a one-off. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 1
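To make the format mismatch concrete, here is a small user-space sketch that prints the same 64-bit value with both format strings named in the message. It does not reproduce the kernel's sysfs helpers; the two function names are hypothetical and exist only for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decimal output, as the message says KFD used (%llu). */
static int format_id_decimal(char *buf, size_t len, uint64_t id)
{
    return snprintf(buf, len, "%llu", (unsigned long long)id);
}

/* Hypothetical helper: zero-padded 16-digit hex, as amdgpu uses (%016llx). */
static int format_id_hex(char *buf, size_t len, uint64_t id)
{
    return snprintf(buf, len, "%016llx", (unsigned long long)id);
}
```

For the value 255, the decimal form is `255` while the hex form is `00000000000000ff`, so tools comparing the two sysfs files as strings would not see them as equal until the formats are aligned.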

[PATCH] drm/amdkfd: Ratelimit oversubscription message

2023-03-02 Thread Kent Russell
On certain applications, this message could end up flooding dmesg. Ratelimit it so that the warning is still available, but doesn't take over the entire log Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion
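The general idea of rate limiting a log message can be sketched as a fixed-window counter. This is a generic user-space illustration, not the kernel's own rate-limiting facility and not the code in kfd_packet_manager.c; the struct, its fields, and the caller-supplied clock are assumptions made so the example stays self-contained.

```c
/*
 * Hypothetical fixed-window rate limiter: at most `burst` messages are
 * allowed in each window of `interval` time units.
 */
struct ratelimit {
    long interval;      /* window length, in the caller's time units */
    int burst;          /* messages allowed per window */
    long window_start;  /* start time of the current window */
    int printed;        /* messages already allowed in this window */
};

/* Returns 1 if a message may be printed at time `now`, 0 if it is dropped. */
static int ratelimit_allow(struct ratelimit *rl, long now)
{
    /* Start a fresh window once the previous one has expired. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }

    return 0;
}
```

The warning stays visible (the first few occurrences in each window get through) while a burst of identical messages can no longer fill the log.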

[PATCH] drm/amdgpu: Handle potential NULL pointer dereference

2022-08-15 Thread Kent Russell
If m is NULL, we will end up referencing a NULL pointer in the subsequent m elements like extcpu, bank and status. Pull the NULL check out and do it first before referencing m's elements. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 - 1 file changed, 4
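The ordering fix described above (check the pointer first, read its members afterwards) looks roughly like this in isolation. The struct below is a hypothetical stand-in that only borrows the member names from the message (extcpu, bank, status), and the condition tested after the NULL check is arbitrary; none of this is the amdgpu_ras.c code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the record the message calls "m". */
struct mce_like {
    int extcpu;
    int bank;
    uint64_t status;
};

/* Returns 1 if the record should be processed, 0 otherwise. */
static int should_handle(const struct mce_like *m)
{
    /* The NULL check comes first, before any member of m is read. */
    if (m == NULL)
        return 0;

    /* Only now is it safe to look at m's members (arbitrary example test). */
    return m->bank >= 0 && m->status != 0;
}
```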

[PATCH] drm/amdgpu: Fix acronym typo in glossary

2022-07-12 Thread Kent Russell
The initialism of RunList Controller is RLC, not RCL Signed-off-by: Kent Russell --- Documentation/gpu/amdgpu/amdgpu-glossary.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/gpu/amdgpu/amdgpu-glossary.rst b/Documentation/gpu/amdgpu/amdgpu-glossary.rst

[PATCH] drm/amdgpu: Fix typos in amdgpu_stop_pending_resets

2022-06-28 Thread Kent Russell
Change amdggpu to amdgpu and pedning to pending Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index

[PATCH] drm/amdgpu: Fix unique_id references for Sienna Cichlid

2022-03-30 Thread Kent Russell
are used to try to get the unique_id/serial_number. Signed-off-by: Kent Russell --- .../swsmu/inc/pmfw_if/smu11_driver_if_sienna_cichlid.h | 6 -- .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c| 10 -- 2 files changed, 4 insertions(+), 12 deletions(-) diff --git a/drivers

[PATCH 4/4] drm/amdgpu: Add unique_id support for sienna cichlid

2022-03-29 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. v2: Add FW version check, remove SMU mutex v3: Fix style warning v4: Add MP1 IP_VERSION check to FW version check Signed-off-by: Kent Russell --- drivers/gpu/drm

[PATCH 3/4] drm/amdgpu: Use metrics data function to get unique_id for Aldebaran

2022-03-29 Thread Kent Russell
This is abstracted well enough in the get_metrics_data function, so use the function Signed-off-by: Kent Russell --- .../gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13

[PATCH 1/4] drm/amdgpu: Use switch case for unique_id

2022-03-29 Thread Kent Russell
that assumes that it is supported unless it is not one of the specified IP_VERSIONs. v2: Rebase onto previous IP_VERSION change Signed-off-by: Kent Russell Reviewed-by: Alex Deucher Reviewed-by: Kevin Wang --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 13 + 1 file changed, 9 insertions(+), 4

[PATCH 2/4] drm/amdgpu: Add UNIQUE_ID to MetricsMember_t

2022-03-29 Thread Kent Russell
This will allow us to use the generic *_get_metrics_data functions for ASICs that support unique_id Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu

[PATCH 2/2] drm/amdgpu: Add unique_id support for sienna cichlid

2022-03-28 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. v2: Add FW version check, remove SMU mutex v3: Fix style warning v4: Add MP1 IP_VERSION check to FW version check Signed-off-by: Kent Russell Reviewed-by: Alex

[PATCH 1/2] drm/amdgpu: Use switch case for unique_id

2022-03-28 Thread Kent Russell
that assumes that it is supported unless it is not one of the specified IP_VERSIONs. v2: Rebase onto previous IP_VERSION change Signed-off-by: Kent Russell Reviewed-by: Alex Deucher Reviewed-by: Kevin Wang --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 13 + 1 file changed, 9 insertions(+), 4

[PATCH 1/2] drm/amdgpu: Use switch case for unique_id

2022-03-28 Thread Kent Russell
that assumes that it is supported unless it is not one of the specified IP_VERSIONs. v2: Rebase onto previous IP_VERSION change Signed-off-by: Kent Russell Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git

[PATCH 2/2] drm/amdgpu: Add unique_id support for sienna cichlid

2022-03-28 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. v2: Add FW version check, remove SMU mutex v3: Fix style warning v4: Add IP_VERSION check to FW version check Signed-off-by: Kent Russell Reviewed-by: Alex Deucher

[PATCH 2/2] drm/amdgpu: Add unique_id support for sienna cichlid

2022-03-28 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. v2: Add FW version check, remove SMU mutex v3: Fix style warning Signed-off-by: Kent Russell Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/pm/amdgpu_pm.c

[PATCH 1/2] drm/amdgpu: Use switch case for unique_id

2022-03-28 Thread Kent Russell
that assumes that it is supported unless it is not one of the specified IP_VERSIONs. v2: Rebase onto previous IP_VERSION change Signed-off-by: Kent Russell Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git

[PATCH 1/2] drm/amdgpu: Change unique_id to use IP_VERSION

2022-03-28 Thread Kent Russell
This is transitioning throughout amdgpu, so we may as well get it started now. This also cleans up the logic on what IP_VERSIONs do or don't support unique_id. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 12 1 file changed, 8 insertions(+), 4 deletions

[PATCH 2/2] drm/amdgpu: Add unique_id support for sienna cichlid

2022-03-28 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. v2: Add FW version check, remove SMU mutex Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c| 1 + .../pmfw_if

[PATCH] drm/amdgpu: Add unique_id support for sienna cichlid

2022-03-25 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. v2: Add FW version check, remove SMU mutex Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c| 3 +- .../pmfw_if

[PATCH] drm/amdkfd: Drop IH ring overflow message to dbg

2022-02-18 Thread Kent Russell
-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c index 9defdbbb4ff8..7041a6714baa 100644 --- a/drivers/gpu/drm/amd/amdkfd

[PATCH] drm/amdgpu: Add unique_id support for sienna cichlid

2022-02-10 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel. Also create the corresponding unique_id sysfs file. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c| 3 +- .../pmfw_if/smu11_driver_if_sienna_cichlid.h | 12 +-- .../amd/pm/swsmu

[PATCH] drm/amdgpu: Add unique_id support for sienna cichlid

2022-02-10 Thread Kent Russell
This is being added to SMU Metrics, so add the required tie-ins in the kernel Signed-off-by: Kent Russell --- .../pmfw_if/smu11_driver_if_sienna_cichlid.h | 12 +-- .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 33 +++ 2 files changed, 43 insertions(+), 2 deletions

[PATCH] drm/amdkfd: Fix ASIC name typos

2022-01-11 Thread Kent Russell
Three misspelled ASICs in comments here, so fix the spelling Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index

[PATCH 4/4] drm/amdgpu: Access the FRU on Aldebaran

2021-12-17 Thread Kent Russell
This is supported, although the offset is different from VG20, so fix that with a variable and enable getting the product name and serial number from the FRU. Do this for all SKUs since all SKUs have the FRU Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 13

[PATCH 3/4] drm/amdgpu: Only overwrite serial if field is empty

2021-12-17 Thread Kent Russell
On Aldebaran, the serial may be obtained from the FRU. Only overwrite the serial with the unique_id if the serial is empty. This will support printing serial numbers for mGPU devices where there are 2 unique_ids for the 2 GPUs, but only one serial number for the board Signed-off-by: Kent Russell
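A minimal sketch of the "only overwrite if empty" rule, assuming the serial is held in a NUL-terminated character buffer. The function name and the choice of a 16-digit hex rendering for the fallback are assumptions for this example, not taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: if a serial (for example, one read from the FRU) is
 * already present, keep it; otherwise fall back to the unique_id.
 */
static void fill_serial_if_empty(char *serial, size_t len, uint64_t unique_id)
{
    if (serial[0] != '\0')
        return; /* a serial is already set, leave it alone */

    snprintf(serial, len, "%016llx", (unsigned long long)unique_id);
}
```

On a board with two GPUs, both devices would then keep the single board serial obtained earlier, while each still has its own unique_id.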

[PATCH 2/4] drm/amdgpu: Enable unique_id for Aldebaran

2021-12-17 Thread Kent Russell
It's supported, so support the unique_id sysfs file Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 082539c70fd4..dfefb147ac2c

[PATCH 1/4] drm/amdgpu: Increase potential product_name to 64 characters

2021-12-17 Thread Kent Russell
Having seen at least 1 42-character product_name, bump the number up to 64, and put that definition into amdgpu.h to make future adjustments simpler. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 12

[PATCH 4/4] drm/amdgpu: Access the FRU on Aldebaran

2021-12-13 Thread Kent Russell
This is supported, although the offset is different from VG20, so fix that with a variable and enable getting the product name and serial number from the FRU. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 12 +--- 1 file changed, 9 insertions(+), 3

[PATCH 2/4] drm/amdgpu: Enable unique_id for Aldebaran

2021-12-13 Thread Kent Russell
It's supported, so support the unique_id sysfs file Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 082539c70fd4..dfefb147ac2c

[PATCH 3/4] drm/amdgpu: Only overwrite serial if field is empty

2021-12-13 Thread Kent Russell
On Aldebaran, the serial may be obtained from the FRU. Only overwrite the serial with the unique_id if the serial is empty. This will support printing serial numbers for mGPU devices where there are 2 unique_ids for the 2 GPUs, but only one serial number for the board Signed-off-by: Kent Russell

[PATCH 1/4] drm/amdgpu: Increase potential product_name to 64 characters

2021-12-13 Thread Kent Russell
Having seen at least 1 42-character product_name, bump the number up to 64. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 8 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers

[PATCH] drm/amdgpu: Fix the freeing of compute pasids

2021-11-08 Thread Kent Russell
to gracefully handle double-frees. Moreover, once we set the context to compute (is_compute_context=true), that won't change during the lifespan of the process. Due to that, we can guarantee that the pasid will be cleaned up during KFD cleanup. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Kent Russell
BOs need to be reserved before they are added or removed, so ensure that they are reserved during kfd_mem_attach and kfd_mem_detach Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git

[PATCH 21.40 2/2] drm/amdgpu: Temporary exclude GTT BOs from KFD memory attachment

2021-11-01 Thread Kent Russell
that since we don't fully support IOMMU in device-isolation mode at this time, this temporary workaround won't break anything that isn't already broken. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm

[PATCH 21.40 1/2] drm/amdgpu: Make sure to reserve BOs before adding or removing

2021-11-01 Thread Kent Russell
BOs need to be reserved before they are added or removed, so ensure that they are reserved during kfd_mem_attach and kfd_mem_detach Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git

[PATCH 2/3] drm/amdgpu: Add kernel parameter support for ignoring bad page threshold

2021-10-21 Thread Kent Russell
the GPU, while printing a warning to dmesg that this action has been done Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell Acked-by: Felix Kuehling Reviewed-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 2

[PATCH 3/3] drm/amdgpu: Make EEPROM messages dev_ instead of DRM_

2021-10-21 Thread Kent Russell
Since the EEPROM is specific to the device for each of these messages, use the dev_* macro instead of DRM_* to make it easier to identify the GPU that correlates to the EEPROM messages. Signed-off-by: Kent Russell --- .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 40 +-- 1

[PATCH 1/3] drm/amdgpu: Warn when bad pages approaches 90% threshold

2021-10-21 Thread Kent Russell
dmesg doesn't warn when the number of bad pages approaches the threshold for page retirement. WARN when the number of bad pages is at 90% or greater for easier checks and planning, instead of waiting until the GPU is full of bad pages. Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent
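The 90% check can be done with integer arithmetic only, which is the usual approach where floating point is unavailable. The sketch below is an assumption about how such a test might be written, not the code in amdgpu_ras_eeprom.c; the function name is hypothetical.

```c
/*
 * Hypothetical helper: returns 1 when bad_pages is at or above 90% of
 * threshold. bad_pages / threshold >= 9/10 is rewritten as
 * bad_pages * 10 >= threshold * 9 to avoid floating point; the operands
 * are widened so the multiplications cannot overflow.
 */
static int near_bad_page_threshold(unsigned int bad_pages,
                                   unsigned int threshold)
{
    if (threshold == 0)
        return 0; /* no meaningful threshold to compare against */

    return (unsigned long long)bad_pages * 10 >=
           (unsigned long long)threshold * 9;
}
```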

[PATCH 2/2] drm/amdgpu: Add kernel parameter support for ignoring bad page threshold

2021-10-21 Thread Kent Russell
the GPU, while printing a warning to dmesg that this action has been done Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell Acked-by: Felix Kuehling Reviewed-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 2

[PATCH 1/2] drm/amdgpu: Warn when bad pages approaches 90% threshold

2021-10-21 Thread Kent Russell
dmesg doesn't warn when the number of bad pages approaches the threshold for page retirement. WARN when the number of bad pages is at 90% or greater for easier checks and planning, instead of waiting until the GPU is full of bad pages. Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent

[PATCH 3/3] drm/amdgpu: Implement bad_page_threshold = -2 case

2021-10-20 Thread Kent Russell
If the bad_page_threshold kernel parameter is set to -2, continue to post the GPU. Print a warning to dmesg that this action has been done, and that page retirement will obviously not work for said GPU Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu
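The decision described above can be summarized as a small function. Only the -2 case comes from the message; the enum, the function name, and the way the page count and limit are passed in are assumptions for this sketch, and other values of the parameter are deliberately not modeled.

```c
/* Hypothetical outcome of the bad-page check at initialization time. */
enum init_decision {
    INIT_PROCEED,
    INIT_PROCEED_WITH_WARNING,
    INIT_REFUSE
};

static enum init_decision decide_init(int bad_page_threshold_param,
                                      unsigned int bad_pages,
                                      unsigned int limit)
{
    if (bad_pages < limit)
        return INIT_PROCEED;

    /* Threshold reached: -2 means "post the GPU anyway, but warn". */
    if (bad_page_threshold_param == -2)
        return INIT_PROCEED_WITH_WARNING;

    return INIT_REFUSE;
}
```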

[PATCH 1/3] drm/amdgpu: Warn when bad pages approaches 90% threshold

2021-10-20 Thread Kent Russell
-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 17 + 1 file changed, 17 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index f4c05ff4b26c..1ede0f0d6f55 100644 --- a/drivers/gpu/drm/amd

[PATCH 2/3] drm/amdgpu: Add kernel parameter support for ignoring bad page threshold

2021-10-20 Thread Kent Russell
the GPU, even when the bad page threshold has been reached. Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm

[PATCH 4/4] drm/amdgpu: Implement ignore_bad_page_threshold parameter

2021-10-19 Thread Kent Russell
If the ignore_bad_page_threshold kernel parameter is set to true, continue to post the GPU. Print a warning to dmesg that this action has been done, and that page retirement will obviously not work for said GPU. Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell --- drivers/gpu/drm

[PATCH 2/4] drm/amdgpu: Clarify error when hitting bad page threshold

2021-10-19 Thread Kent Russell
Change the error message when the bad_page_threshold is reached, explicitly stating that the GPU will not be initialized. Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

[PATCH 3/4] drm/amdgpu: Add kernel parameter for ignoring bad page threshold

2021-10-19 Thread Kent Russell
initialize the GPU, even when the bad page threshold has been reached. Cc: Luben Tuikov Cc: Mukul Joshi Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 13 + 2 files changed, 14 insertions(+) diff --git a/drivers

[PATCH 1/4] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-19 Thread Kent Russell
-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 98732518543e..8270aad23a06 100644 --- a/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-14 Thread Kent Russell
Currently dmesg doesn't warn when the number of bad pages approaches the threshold for page retirement. WARN when the number of bad pages is at 90% or greater for easier checks and planning, instead of waiting until the GPU is full of bad pages Signed-off-by: Kent Russell --- drivers/gpu/drm

[PATCH] drm/amdgpu: Warn when bad pages approaches threshold

2021-10-12 Thread Kent Russell
Currently dmesg doesn't warn when the number of bad pages approaches the threshold for page retirement. WARN when the number of bad pages is at 90% or greater for easier checks and planning, instead of waiting until the GPU is full of bad pages Signed-off-by: Kent Russell --- drivers/gpu/drm

[PATCH] drm/amdgpu: Ensure dcefclk isn't created on Aldebaran

2021-04-12 Thread Kent Russell
Like Arcturus, this isn't available on Aldebaran, so remove it accordingly Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index

[PATCH] drm/amdkfd: Get unique_id dynamically v2

2021-02-03 Thread Kent Russell
. v2: Drop previous patch printing unique_id in hex Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 - 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b

[PATCH] drm/amdkfd: Get unique_id dynamically

2021-02-03 Thread Kent Russell
. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 - 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index

[PATCH] drm/amdkfd: Print unique_id as hex instead of decimal

2021-02-03 Thread Kent Russell
Add a new helper function for printing Topology values to support printing 64-bit hex values. Use this for unique_id to ensure that the unique_id in KFD's topology matches the one in amdgpu's sysfs pool. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 +++- 1 file

[PATCH] drm/amdgpu: Fix Arcturus fan speed reporting

2020-11-05 Thread Kent Russell
l return 0 for the fan speed. Trying to use the smu_v11_0_get_fan_speed_rpm function will return invalid data, so just stick with the SMU metrics for Arcturus Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-)

[PATCH] amdkfd: Check kvmalloc return before memcpy

2020-11-02 Thread Kent Russell
If we can't kvmalloc the pcrat_image, then we shouldn't memcpy Signed-off-by: Kent Russell Reported-by: kernel test robot --- drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd
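The same pattern in user space, with `malloc` standing in for `kvmalloc`: check the allocation result before copying into it. The function name is hypothetical and this is not the kfd_crat.c code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate an image of `size` bytes.
 * Returns NULL if the allocation fails; the copy is only attempted
 * after the allocation has been checked.
 */
static void *dup_image(const void *src, size_t size)
{
    void *copy = malloc(size);

    if (copy == NULL)
        return NULL; /* never memcpy into a failed allocation */

    memcpy(copy, src, size);
    return copy;
}
```

The caller owns the returned buffer and releases it with `free`.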

[PATCH] drm/amdkfd: Fix getting unique_id in topology

2020-10-28 Thread Kent Russell
only need it in the KFD node properties struct Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 -- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 3 files changed, 1 insertion(+), 6 deletions(-) diff --git

[PATCH 1/2] drm/amdkfd: Fix getting unique_id in topology

2020-10-28 Thread Kent Russell
Since the unique_id is now obtained in amdgpu in smu_late_init, topology's device addition is now happening before the unique_id is saved, thus topology misses it. To work around this, we use the amdgpu_amdkfd_get_unique_id to get the unique_id at read time. Signed-off-by: Kent Russell

[PATCH 2/2] drm/amdkfd: Change unique_id to print hex format

2020-10-28 Thread Kent Russell
amdgpu's unique_id prints in hex format, so change topology's printout to hex by adding a new sysfs_print macro specifically for hex output, and use it for unique_id Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion

[PATCH] drm/amdkfd: Use kvfree in destroy_crat_image

2020-10-14 Thread Kent Russell
Now that we use kvmalloc for the crat_image, we need to use kvfree when we destroy this. Fixes: a522a06f8044 ("drm/amdkfd: Use kvmalloc instead of kmalloc for VCRAT") Reported-by: Morris Zhang Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +- 1 file

[PATCH] drm/amdgpu: Use SKU instead of DID for FRU check v2

2020-09-29 Thread Kent Russell
-by: Kent Russell --- .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 35 +-- 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c index e811fecc540f..8f4a8f8d8146 100644

[PATCH] drm/amdgpu: Use SKU instead of DID for FRU check v2

2020-09-29 Thread Kent Russell
-by: Kent Russell --- .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 34 +-- 1 file changed, 23 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c index e811fecc540f..01208519f9d7 100644

[PATCH] drm/amdgpu: Use SKU instead of DID for FRU check

2020-09-29 Thread Kent Russell
The VG20 DIDs 66a0, 66a1 and 66a4 are used for various SKUs that may or may not have the FRU EEPROM on it. Parse the VBIOS to check for server SKU variants (D131 or D134) until a more general solution can be determined. Signed-off-by: Kent Russell --- .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c

[PATCH] drm/amdkfd: Use kvmalloc instead of kmalloc for VCRAT

2020-09-21 Thread Kent Russell
Since we're dynamically allocating the CPU VCRAT, use kvmalloc in case the allocation size is huge. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers

[PATCH] drm/amdkfd: Calculate CPU VCRAT size dynamically

2020-09-18 Thread Kent Russell
Instead of guessing at a sufficient size for the CPU VCRAT, base the size on the number of online NUMA nodes. Signed-off-by: Kent Russell Change-Id: I5fb6e60f1410ad696ab78d780d0b40d01a4f829b --- drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 26 -- 1 file changed, 16 insertions
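Sizing a table from the node count rather than from a fixed guess can be sketched as below. The message does not give the actual per-node layout, so the header and per-node byte counts are left as caller-supplied parameters; the function name is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: total bytes needed for a table that has one fixed
 * header plus a fixed amount of data for each online NUMA node.
 */
static size_t table_size_for_nodes(size_t header_bytes,
                                   size_t per_node_bytes,
                                   unsigned int online_nodes)
{
    return header_bytes + per_node_bytes * online_nodes;
}
```

The allocation then grows with the machine instead of relying on a constant that may be too small on large systems.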

[PATCH] drm/amd/display: Explicitly set stack size to 4

2020-07-21 Thread Kent Russell
In certain kernels using GCC 8.2, we get compilation errors saying: -mpreferred-stack-boundary=3 is not between 4 and 12 Explicitly set -mpreferred-stack-boundary=4 in the Display Makefiles, even when SSE2 is enabled Change-Id: Ic7c4637e2e521af2d0444d3b5886f710131c80ca Signed-off-by: Kent Russell

[PATCH] drm/amdgpu: Fix compile warning in amdgpu_fru_read_eeprom

2020-06-29 Thread Kent Russell
This fixes the missing-prototypes warning for the amdgpu_fru_read_eeprom function. Since we declare it in the header, we can make it un-static Signed-off-by: Kent Russell Reported-by: kernel test robot Change-Id: I2b9419365cb8b38bcfb6582df53b96c83861d6cf --- drivers/gpu/drm/amd/amdgpu

[PATCH 1/2] drm/amdgpu: Add ReadSerial defines for Arcturus

2020-06-02 Thread Kent Russell
Add the ReadSerial definitions for Arcturus to the arcturus_ppsmc.h header for use with unique_id Unrevert: Supported in SMU 54.23, update values to match SMU spec Signed-off-by: Kent Russell Reviewed-by: Alex Deucher Change-Id: I9a70368ea65b898b3c26f0d57dc088f21dab9c53 --- drivers/gpu/drm

[PATCH 2/2] drm/amdgpu: Add unique_id and serial_number for Arcturus v3

2020-06-02 Thread Kent Russell
Add support for unique_id and serial_number, as these are now the same value, and will be for future ASICs as well. v2: Explicitly create unique_id only for VG10/20/ARC v3: Change set_unique_id to get_unique_id for clarity Signed-off-by: Kent Russell Change-Id

[PATCH 2/2] drm/amdgpu: Add unique_id and serial_number for Arcturus v2

2020-06-02 Thread Kent Russell
Add support for unique_id and serial_number, as these are now the same value, and will be for future ASICs as well. v2: Explicitly create unique_id only for VG10/20/ARC Signed-off-by: Kent Russell Change-Id: I3b036a38b19cd84025399b0706b2dad9b7aff713 --- drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c

[PATCH 1/2] drm/amdgpu: Add ReadSerial defines for Arcturus

2020-06-02 Thread Kent Russell
Add the ReadSerial definitions for Arcturus to the arcturus_ppsmc.h header for use with unique_id Unrevert: Supported in SMU 54.23, update values to match SMU spec Signed-off-by: Kent Russell Reviewed-by: Alex Deucher Change-Id: I9a70368ea65b898b3c26f0d57dc088f21dab9c53 --- drivers/gpu/drm

[PATCH 2/2] drm/amdgpu: Add unique_id and serial_number for Arcturus

2020-06-01 Thread Kent Russell
Add support for unique_id and serial_number, as these are now the same value, and will be for future ASICs as well. Signed-off-by: Kent Russell Change-Id: I3b036a38b19cd84025399b0706b2dad9b7aff713 --- drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c| 2 +- drivers/gpu/drm/amd/powerplay

[PATCH 1/2] drm/amdgpu: Add ReadSerial defines for Arcturus

2020-06-01 Thread Kent Russell
Add the ReadSerial definitions for Arcturus to the arcturus_ppsmc.h header for use with unique_id Unrevert: Supported in SMU 54.23, update values to match SMU spec Signed-off-by: Kent Russell Reviewed-by: Alex Deucher Change-Id: I9a70368ea65b898b3c26f0d57dc088f21dab9c53 --- drivers/gpu/drm

[PATCH 1/2] Revert "drm/amdgpu: Add unique_id for Arcturus"

2020-05-07 Thread Kent Russell
This reverts commit b3abbca4eca6091e0e657baf9a5402e204e97d4c. SMU does not support this on Arcturus right now Signed-off-by: Kent Russell Change-Id: Ia5f29f91a64005f68dbb9499b43789fe473cd00c --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 10 -- 1 file changed, 10 deletions(-) diff

[PATCH 2/2] Revert "drm/amdgpu: Add ReadSerial defines for Arcturus"

2020-05-07 Thread Kent Russell
This reverts commit a0d4939d8616fab676699dab8ba3cbe519d4a8e9. SMU does not support this on Arcturus right now Signed-off-by: Kent Russell Change-Id: I50efa3695570302231986d56c2351aac331726ca --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 2 -- drivers/gpu/drm/amd/powerplay/inc

[PATCH] drm/amdgpu: Don't report unique_id for Arcturus

2020-05-07 Thread Kent Russell
This isn't supported in the SMU yet, so just break early. This can be reverted once the SMU supports the feature Signed-off-by: Kent Russell Change-Id: I09945613aa7400afdf3f9d5dc0ffb636ee2896f7 --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 5 + 1 file changed, 5 insertions(+) diff

[PATCH] Revert "Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2""

2020-05-05 Thread Kent Russell
This reverts commit e71391880aa72709fac53f98d96a2d4e8875b9fa. The RAS issue at the base of this problem appears to have been addressed Signed-off-by: Kent Russell Change-Id: I338a985e19cae8e103bd44b0f238314e9460d396 --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 ++ 1

[PATCH 1/2] drm/amdgpu: Add ReadSerial defines for Arcturus

2020-04-27 Thread Kent Russell
Add the ReadSerial definitions for Arcturus to the arcturus_ppsmc.h header for use with unique_id Signed-off-by: Kent Russell Change-Id: I71849ec381730b1803e54cd6040aa3b245b53de8 --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 2 ++ drivers/gpu/drm/amd/powerplay/inc/arcturus_ppsmc.h

[PATCH 2/2] drm/amdgpu: Add unique_id for Arcturus

2020-04-27 Thread Kent Russell
Add support for unique_id for Arcturus, since we only have the ppsmc definitions for that added at the moment Signed-off-by: Kent Russell Change-Id: I66f8e9ff41521d6c13ff673587d6061c1f3f4b7a --- drivers/gpu/drm/amd/powerplay/arcturus_ppt.c | 10 ++ 1 file changed, 10 insertions(+) diff

[PATCH] drm/amdgpu: Disable FRU read on Arcturus

2020-04-16 Thread Kent Russell
Update the list with supported Arcturus chips, but disable for now until final list is confirmed. Ideally we can poll atombios for FRU support, instead of maintaining this list of chips, but this will enable serial number reading for supported ASICs for the time-being. Signed-off-by: Kent

[PATCH] drm/scheduler: fix drm_sched_get_cleanup_job

2020-04-14 Thread Kent Russell
From: Christian König We are racing to initialize sched->thread here, just always check the current thread. Signed-off-by: Christian Koenig Reviewed-by: Kent Russell --- drivers/gpu/drm/scheduler/sched_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/

[PATCH] Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2"

2020-04-13 Thread Kent Russell
! [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block failed -5 Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 26 -- 1 file changed, 26 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Re-enable FRU check for most models v4

2020-04-06 Thread Kent Russell
: Don't default to true for pre-VG20 v4: Use DID instead of parsing the VBIOS Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu

[PATCH] drm/amdgpu: Re-enable FRU check for most models v4

2020-04-06 Thread Kent Russell
DID instead of parsing the VBIOS Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c index

[PATCH] drm/amdgpu: Re-enable FRU check for most models v3

2020-04-03 Thread Kent Russell
-off-by: Kent Russell --- .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 22 +-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c index bfe4259f9508..508906177cad 100644

[PATCH] drm/amdgpu: Re-enable FRU check for most models v2

2020-04-03 Thread Kent Russell
There are 2 VG20 SKUs that do not have the FRU on there, and trying to read that will cause a hang. For now, check for the gaming SKU until a proper fix can be implemented. This re-enables serial number reporting for server cards v2: Add ASIC check Signed-off-by: Kent Russell --- drivers/gpu

[PATCH] drm/amdgpu: Re-enable FRU check for most models

2020-04-03 Thread Kent Russell
There are 2 VG20 SKUs that do not have the FRU on there, and trying to read that will cause a hang. For now, check for the gaming SKU until a proper fix can be implemented. This re-enables serial number reporting for server cards Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Re-enable FRU check for most models

2020-04-03 Thread Kent Russell
There are 2 SKUs that do not have the FRU on there, and trying to read that will cause a hang. For now, check for the gaming SKU until a proper fix can be implemented. This re-enables serial number reporting for server cards Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Expose TA FW version in fw_version file

2020-03-24 Thread Kent Russell
Reporting the fw_version just returns 0; the actual version is kept as ta_*_ucode_version. This is the same as the feature reported in the amdgpu_firmware_info debugfs file. Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 4 ++-- 1 file changed, 2 insertions(+), 2

[PATCH 1/3] drm/amdgpu: Add documentation for memory info

2020-03-20 Thread Kent Russell
Add the amdgpu.rst tie-ins for the mem_info documentation Signed-off-by: Kent Russell --- Documentation/gpu/amdgpu.rst | 41 1 file changed, 41 insertions(+) diff --git a/Documentation/gpu/amdgpu.rst b/Documentation/gpu/amdgpu.rst index d9ea09ec8e24

[PATCH 3/3] drm/amdgpu: Add documentation for unique_id

2020-03-20 Thread Kent Russell
Add the amdgpu.rst tie-ins for the unique_id documentation Signed-off-by: Kent Russell --- Documentation/gpu/amdgpu.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/gpu/amdgpu.rst b/Documentation/gpu/amdgpu.rst index 9afcc30e0f42..4cc74325bf91 100644

[PATCH 2/3] drm/amdgpu: Add documentation for PCIe accounting

2020-03-20 Thread Kent Russell
Add the amdgpu.rst tie-ins for the pcie accounting documentation Signed-off-by: Kent Russell --- Documentation/gpu/amdgpu.rst | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/Documentation/gpu/amdgpu.rst b/Documentation/gpu/amdgpu.rst index cb689fab94c7

[PATCH] Enable reading FRU chip via I2C v3

2020-03-19 Thread Kent Russell
documentation to amdgpu.rst, add helper functions, rename functions for consistency, fix bad starting offset v3: Remove testing definitions Signed-off-by: Kent Russell --- Documentation/gpu/amdgpu.rst | 24 +++ drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers

[PATCH] Enable reading FRU chip via I2C v2

2020-03-19 Thread Kent Russell
documentation to amdgpu.rst, add helper functions, rename functions for consistency, fix bad starting offset Signed-off-by: Kent Russell --- Documentation/gpu/amdgpu.rst | 24 +++ drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h

[PATCH] drm/amdgpu: Fix check for DPM when returning max clock

2020-02-25 Thread Kent Russell
is working correctly. Change-Id: I967988e936de5371c22bf92895bda22324d9631b Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgp

[PATCH] drm/powerplay: Ratelimit PP_ASSERT warnings

2020-02-12 Thread Kent Russell
: Ib817fd9227e9ffec8f1ed18c5441cbb135bc413b Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/powerplay/inc/pp_debug.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/inc/pp_debug.h b/drivers/gpu/drm/amd/powerplay/inc/pp_debug.h index 822cd8b5bf90

[PATCH] drm/amdkfd: Remove the requirement for atomic Ops on vg20

2018-09-26 Thread Kent Russell
Signed-off-by: Kent Russell --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index b0c2afb..b505bf3 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
