On 20.10.21 at 14:55, Das, Nirmoy wrote:
On 10/20/2021 1:51 PM, Christian König wrote:
On 20.10.21 at 13:50, Christian König wrote:
On 13.10.21 at 17:09, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: z
On 21.10.21 at 04:07, zhang wrote:
On 2021/10/20 19:51, Christian König wrote:
On 20.10.21 at 13:50, Christian König wrote:
On 13.10.21 at 17:09, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: zhang
Signe
On 20.10.21 at 21:32, Andrey Grodzovsky wrote:
On 2021-10-04 4:14 a.m., Christian König wrote:
The problem is a bit different.
The callback is on the dependent fence, while we need to signal the
scheduler fence.
Daniel is right that this needs an irq_work struct to handle this
properly
On 10/20/2021 10:05 PM, Kent Russell wrote:
If the bad_page_threshold kernel parameter is set to -2,
continue to post the GPU. Print a warning to dmesg that this action has
been done, and that page retirement will obviously not work for said GPU
Cc: Luben Tuikov
Cc: Mukul Joshi
Signed-off-b
[AMD Official Use Only]
Reviewed-by: Aaron Liu
--
Best Regards
Aaron Liu
> -Original Message-
> From: amd-gfx On Behalf Of Alex
> Deucher
> Sent: Wednesday, October 20, 2021 9:53 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander
> Subject: [PATCH] drm/amdgpu/display: add
On Wed, Oct 20, 2021 at 10:27 PM Huang Rui wrote:
>
> PSP firmware will be responsible for applying the GRBM CAM remapping in
> the production. And the GRBM_CAM_INDEX / GRBM_CAM_DATA registers will be
> protected by PSP under security policy. So remove it according to the
> new security policy.
>
PSP firmware will be responsible for applying the GRBM CAM remapping in
the production. And the GRBM_CAM_INDEX / GRBM_CAM_DATA registers will be
protected by PSP under security policy. So remove it according to the
new security policy.
Signed-off-by: Huang Rui
---
drivers/gpu/drm/amd/amdgpu/gfx_
On 2021/10/20 19:51, Christian König wrote:
On 20.10.21 at 13:50, Christian König wrote:
On 13.10.21 at 17:09, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: zhang
Signed-off-by: Nirmoy Das
---
drivers/
Not all of the migrate.cpages returned from migrate_vma_setup can be migrated,
for example non-anonymous pages, or when out of device memory. So after
migrate_vma_pages returns, add a debug message that counts the pages that were
successfully migrated, i.e. those with the MIGRATE_PFN_VALID and
MIGRATE_PFN_MIGRATE flags set.
Signed-off-b
cpages is only updated by migrate_vma_setup. So capture its value at
that point to clarify the significance of the number. The next patch
will add counting of actually migrated pages after migrate_vma_pages for
debug purposes.
Signed-off-by: Philip Yang
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate
Not all of the migrate.cpages returned from migrate_vma_setup can be migrated,
for example non-anonymous pages, or when out of device memory. So after
migrate_vma_pages returns, check which pages were successfully migrated, i.e.
those with the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flags set.
Signed-off-by: Philip Yang
---
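The check described in the patch above could look roughly like the following userspace sketch. It is not the code from the patch: the flag values are local stand-ins for the kernel's MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE bits, and the function name count_migrated_pages and the plain array argument are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's MIGRATE_PFN_* bits. The bit positions
 * are assumptions for this sketch, not copied from the patch. */
#define SKETCH_PFN_VALID   (1UL << 0)
#define SKETCH_PFN_MIGRATE (1UL << 1)

/* Hypothetical helper: count the entries that have BOTH flags set,
 * i.e. the pages that were actually migrated. */
static size_t count_migrated_pages(const unsigned long *pfns, size_t npages)
{
    const unsigned long want = SKETCH_PFN_VALID | SKETCH_PFN_MIGRATE;
    size_t i, migrated = 0;

    assert(pfns != NULL || npages == 0);

    for (i = 0; i < npages; i++) {
        if ((pfns[i] & want) == want)
            migrated++;
    }
    return migrated;
}
```

The result can be smaller than cpages, which is the point of the patch: cpages only reflects what migrate_vma_setup collected, not what was migrated in the end.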
On 2021-10-20 5:50 p.m., Felix Kuehling wrote:
On 2021-10-20 12:35 p.m., Kent Russell wrote:
Currently dmesg doesn't warn when the number of bad pages approaches the
threshold for page retirement. WARN when the number of bad pages
is at 90% or greater for easier checks and planning, instead of w
On 2021-10-20 17:50, Felix Kuehling wrote:
> On 2021-10-20 12:35 p.m., Kent Russell wrote:
>> Currently dmesg doesn't warn when the number of bad pages approaches the
>> threshold for page retirement. WARN when the number of bad pages
>> is at 90% or greater for easier checks and planning, instead
On 2021-10-20 17:54, Felix Kuehling wrote:
> On 2021-10-20 12:35 p.m., Kent Russell wrote:
>> If the bad_page_threshold kernel parameter is set to -2,
>> continue to post the GPU. Print a warning to dmesg that this action has
>> been done, and that page retirement will obviously not work for said G
On 2021-10-20 12:35 p.m., Kent Russell wrote:
If the bad_page_threshold kernel parameter is set to -2,
continue to post the GPU. Print a warning to dmesg that this action has
been done, and that page retirement will obviously not work for said GPU
I'd squash patch 2 and 3. The squashed patch is
On 2021-10-20 12:35 p.m., Kent Russell wrote:
Currently dmesg doesn't warn when the number of bad pages approaches the
threshold for page retirement. WARN when the number of bad pages
is at 90% or greater for easier checks and planning, instead of waiting
until the GPU is full of bad pages
Cc: L
On 2021-10-20 12:35, Kent Russell
wrote:
Currently dmesg doesn't warn when the number of bad pages approaches the
"Currently" is redundant in this sentence as it is already in
present simple tense.
threshold for page retirement. WARN
On 2021-10-04 4:14 a.m., Christian König wrote:
The problem is a bit different.
The callback is on the dependent fence, while we need to signal the
scheduler fence.
Daniel is right that this needs an irq_work struct to handle this
properly.
Christian.
So we had some discussions with Ch
On 10/20/21 18:12, Dan Williams wrote:
> On Wed, Oct 20, 2021 at 10:09 AM Joao Martins
> wrote:
>> On 10/19/21 20:21, Dan Williams wrote:
>>> On Tue, Oct 19, 2021 at 9:02 AM Jason Gunthorpe wrote:
On Tue, Oct 19, 2021 at 04:13:34PM +0100, Joao Martins wrote:
> On 10/19/21 00:06, Jason G
On Wed, Oct 20, 2021 at 10:09 AM Joao Martins wrote:
>
> On 10/19/21 20:21, Dan Williams wrote:
> > On Tue, Oct 19, 2021 at 9:02 AM Jason Gunthorpe wrote:
> >>
> >> On Tue, Oct 19, 2021 at 04:13:34PM +0100, Joao Martins wrote:
> >>> On 10/19/21 00:06, Jason Gunthorpe wrote:
> On Mon, Oct 18,
On 10/19/21 20:21, Dan Williams wrote:
> On Tue, Oct 19, 2021 at 9:02 AM Jason Gunthorpe wrote:
>>
>> On Tue, Oct 19, 2021 at 04:13:34PM +0100, Joao Martins wrote:
>>> On 10/19/21 00:06, Jason Gunthorpe wrote:
On Mon, Oct 18, 2021 at 12:37:30PM -0700, Dan Williams wrote:
>> device-da
If the bad_page_threshold kernel parameter is set to -2,
continue to post the GPU. Print a warning to dmesg that this action has
been done, and that page retirement will obviously not work for said GPU
Cc: Luben Tuikov
Cc: Mukul Joshi
Signed-off-by: Kent Russell
---
drivers/gpu/drm/amd/amdgpu/
Currently dmesg doesn't warn when the number of bad pages approaches the
threshold for page retirement. WARN when the number of bad pages
is at 90% or greater for easier checks and planning, instead of waiting
until the GPU is full of bad pages
Cc: Luben Tuikov
Cc: Mukul Joshi
Signed-off-by: Ken
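The two behaviours described in the patches above (warn once the bad page count reaches 90% of the threshold, and keep initializing the GPU when bad_page_threshold is -2) might be sketched as below. This is only an illustration under assumptions: the function names, the SKETCH_IGNORE_THRESHOLD macro and the way the threshold is passed in are all hypothetical and do not come from the amdgpu driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of the bad_page_threshold parameter that, per the patch
 * description, means "continue to post the GPU". Macro name is
 * hypothetical. */
#define SKETCH_IGNORE_THRESHOLD (-2)

/* True once the bad page count is at 90% of the threshold or more.
 * Integer-only math; 64-bit products avoid overflow. */
static bool near_bad_page_threshold(uint32_t bad_pages, uint32_t threshold)
{
    return threshold != 0 &&
           (uint64_t)bad_pages * 10 >= (uint64_t)threshold * 9;
}

/* True if the GPU should still be initialized: either the override
 * is set, or the threshold has not been hit yet. */
static bool should_init_gpu(int threshold_param, uint32_t bad_pages,
                            uint32_t threshold)
{
    if (threshold_param == SKETCH_IGNORE_THRESHOLD)
        return true;
    return bad_pages < threshold;
}
```

A caller would print the dmesg warning when near_bad_page_threshold() returns true, and a second warning when should_init_gpu() succeeds only because of the override, since page retirement does not work in that case.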
When a GPU hits the bad_page_threshold, it will not be initialized by
the amdgpu driver. This means that the table cannot be cleared, nor can
information gathering be performed (getting serial number, BDF, etc).
Add an override by using amdgpu_bad_page_threshold = -2 which will still
initialize the
As it stands, we have at least two customers who are focused on having the
threshold automatically remove the GPUs from use, to ensure data integrity.
They just want warnings to know that it's getting bad (my 90% threshold patch),
so that they can plan for HW replacement accordingly.
We could
[AMD Official Use Only]
I can see both sides of the argument. Having a configurable threshold means
that you can determine what sort of "HW reliability" that you want. The default
value is likely not going to get hit by the average user. And users that DO hit
that threshold can determine if the
On 2021-10-20 09:54, Tom St Denis wrote:
> Move dpcs headers from asic_reg/dcn to asic_reg/dpcs.
>
> Update various .c files to include new path.
>
> Signed-off-by: Tom St Denis
Acked-by: Harry Wentland
Harry
> ---
> drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 ++--
>
On 2021-10-20 9:53 a.m., Alex Deucher wrote:
Fix revision id.
Fixes: 626cbb641f1052 ("drm/amdgpu: support B0&B1 external revision id for yellow
carp")
Signed-off-by: Alex Deucher
Reviewed-by: Nicholas Kazlauskas
Regards,
Nicholas Kazlauskas
---
drivers/gpu/drm/amd/display/include/dal_a
On 2021-10-19 16:51, Alex Deucher wrote:
> Unused. Remove it.
>
> Fixes: d1065882691179 ("Revert "drm/amd/display: Add helper for blanking all
> dp displays"")
> Signed-off-by: Alex Deucher
Reviewed-by: Harry Wentland
Harry
> ---
> drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c | 1 -
>
On 2021-10-20 09:53, Alex Deucher wrote:
> Fix revision id.
>
> Fixes: 626cbb641f1052 ("drm/amdgpu: support B0&B1 external revision id for
> yellow carp")
> Signed-off-by: Alex Deucher
Acked-by: Harry Wentland
Harry
> ---
> drivers/gpu/drm/amd/display/include/dal_asic_id.h | 2 +-
> 1 file
Move dpcs headers from asic_reg/dcn to asic_reg/dpcs.
Update various .c files to include new path.
Signed-off-by: Tom St Denis
---
drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 ++--
drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c | 4 ++--
drivers/gpu/drm/amd/
Fix revision id.
Fixes: 626cbb641f1052 ("drm/amdgpu: support B0&B1 external revision id for
yellow carp")
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/display/include/dal_asic_id.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/include/dal
On 10/20/2021 1:51 PM, Christian König wrote:
On 20.10.21 at 13:50, Christian König wrote:
On 13.10.21 at 17:09, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: zhang
Signed-off-by: Nirmoy Das
---
driver
On 20.10.21 at 13:50, Christian König wrote:
On 13.10.21 at 17:09, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: zhang
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_test.c | 25 +
On 13.10.21 at 17:09, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: zhang
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_test.c | 25
1 file changed, 12 insertion
On 10/20/2021 12:49 PM, Christian König wrote:
On 20.10.21 at 11:19, Lazar, Lijo wrote:
On 10/20/2021 2:18 PM, Das, Nirmoy wrote:
On 10/20/2021 8:49 AM, Christian König wrote:
On 19.10.21 at 20:14, Nirmoy Das wrote:
Do not allow exported amdgpu_gtt_mgr_*() to accept
any ttm_resource_ma
On 10/20/2021 12:51 PM, Christian König wrote:
On 20.10.21 at 12:21, Das, Nirmoy wrote:
On 10/20/2021 12:15 PM, Lazar, Lijo wrote:
On 10/20/2021 3:42 PM, Das, Nirmoy wrote:
On 10/20/2021 12:03 PM, Lazar, Lijo wrote:
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
On 10/20/2021 11:11 AM
On 19.10.21 at 19:50, Kent Russell wrote:
When a GPU hits the bad_page_threshold, it will not be initialized by
the amdgpu driver. This means that the table cannot be cleared, nor can
information gathering be performed (getting serial number, BDF, etc).
Add an override called ignore_bad_page_thr
On 20.10.21 at 12:21, Das, Nirmoy wrote:
On 10/20/2021 12:15 PM, Lazar, Lijo wrote:
On 10/20/2021 3:42 PM, Das, Nirmoy wrote:
On 10/20/2021 12:03 PM, Lazar, Lijo wrote:
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
On 10/19/2021 11:44 PM, N
On 20.10.21 at 11:19, Lazar, Lijo wrote:
On 10/20/2021 2:18 PM, Das, Nirmoy wrote:
On 10/20/2021 8:49 AM, Christian König wrote:
On 19.10.21 at 20:14, Nirmoy Das wrote:
Do not allow exported amdgpu_gtt_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force othe
On 10/20/2021 12:15 PM, Lazar, Lijo wrote:
On 10/20/2021 3:42 PM, Das, Nirmoy wrote:
On 10/20/2021 12:03 PM, Lazar, Lijo wrote:
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
On 10/19/2021 11:44 PM, Nirmoy Das wrote:
Get rid of pin/unpin of ga
On 10/20/2021 3:42 PM, Das, Nirmoy wrote:
On 10/20/2021 12:03 PM, Lazar, Lijo wrote:
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
On 10/19/2021 11:44 PM, Nirmoy Das wrote:
Get rid of pin/unpin of gart BO at resume/suspend and
instead pin only
On 10/20/2021 12:03 PM, Lazar, Lijo wrote:
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
On 10/19/2021 11:44 PM, Nirmoy Das wrote:
Get rid of pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resum
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
On 10/19/2021 11:44 PM, Nirmoy Das wrote:
Get rid of pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case ther
On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
On 10/19/2021 11:44 PM, Nirmoy Das wrote:
Get rid of pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case there
is an OOM situation at 2nd call to amdgpu_devi
On 10/20/2021 2:18 PM, Das, Nirmoy wrote:
On 10/20/2021 8:49 AM, Christian König wrote:
On 19.10.21 at 20:14, Nirmoy Das wrote:
Do not allow exported amdgpu_gtt_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force other module to call a ttm function just to
ev
On 10/19/2021 11:44 PM, Nirmoy Das wrote:
Get rid of pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case there
is an OOM situation at 2nd call to amdgpu_device_evict_resources()
while evicting GART tabl
ping.
On 10/13/2021 5:09 PM, Nirmoy Das wrote:
GTT BO cleanup code is within the test for loop and
we would skip cleaning up GTT BO on success.
Reported-by: zhang
Signed-off-by: Nirmoy Das
---
drivers/gpu/drm/amd/amdgpu/amdgpu_test.c | 25
1 file changed, 12 inser
On 10/20/2021 8:49 AM, Christian König wrote:
On 19.10.21 at 20:14, Nirmoy Das wrote:
Do not allow exported amdgpu_gtt_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force other module to call a ttm function just to
eventually call gtt_mgr functions.
That's a r
On 10/20/2021 8:52 AM, Christian König wrote:
On 19.10.21 at 20:14, Nirmoy Das wrote:
Get rid of pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case there
is an OOM situation at 2nd call to amdgpu_devi
On Wed, 20 Oct 2021, Arunpravin wrote:
> - Move i915_buddy.c to drm root folder
> - Rename "i915" string with "drm" string wherever applicable
> - Rename "I915" string with "DRM" string wherever applicable
> - Fix header file dependencies
> - Fix alignment issues
>
> Signed-off-by: Arunpravin
> -
Hi
On 20.10.21 at 00:53, Arunpravin wrote:
- Include drm buddy to DRM root Makefile
- Add drm buddy init and exit function calls
to drm core
Is there a hard requirement to have this code in the core?
IMHO there's already too much code in the DRM core that should rather go
into helpers. T
Well please keep in mind that each patch on its own should not break
anything.
Especially patches #1, #2, #3 and #10 look like they need to be squashed
together to cleanly move the i915 code into a common place.
Christian.
On 20.10.21 at 00:53, Arunpravin wrote:
This series of patches impl