[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread jianzh
From: Jiange Zhao When GPU got timeout, it would notify an interested party of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is due, amdgpu would notify usermode app through
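The usermode side described in this snippet can be sketched roughly as follows. This is only an illustration, not code from the patch: the node path, the function names and the dump callback are assumptions, and the idea that closing the file lets the reset proceed is taken from the review replies in this thread rather than from the patch text shown here.

```python
import os
import select

# Hypothetical path: the snippet only says an 'autodump' node exists under debugfs.
AUTODUMP_PATH = "/sys/kernel/debug/dri/0/amdgpu_autodump"


def wait_for_event(fd, timeout_ms=None, mask=select.POLLIN | select.POLLOUT):
    """Block in poll() until fd reports an event in mask.

    Returns True if the descriptor became ready, False if the timeout expired.
    A timeout of None waits indefinitely.
    """
    poller = select.poll()
    poller.register(fd, mask)
    events = poller.poll(timeout_ms)
    return any(ev & mask for _, ev in events)


def dump_on_reset(path=AUTODUMP_PATH, dump=lambda: None):
    """Open the node, wait for the driver's notification, then run dump().

    dump is a hypothetical callback that collects whatever state the
    application wants to save before the GPU is reset.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if wait_for_event(fd):
            dump()
    finally:
        # Per the review discussion, closing the file is what tells the
        # driver that the application is done and the reset can continue.
        os.close(fd)
```

Keeping the poll() wrapper separate from the open/close handling means the waiting logic can be exercised on any pollable descriptor, such as a pipe, without a GPU present.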

[pull] amdgpu, amdkfd drm-fixes-5.7

2020-05-13 Thread Alex Deucher
Hi Dave, Daniel, Fixes for 5.7. The following changes since commit a9fe6f18cde03c20facbf75dc910a372c1c1025b: Merge tag 'drm-misc-fixes-2020-05-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2020-05-08 15:04:25 +1000) are available in the Git repository at:

RE: [PATCH] drm/amdgpu: Updated XGMI power down control support check

2020-05-13 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking From: Clements, John Sent: Thursday, May 14, 2020 11:23 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking Subject: [PATCH] drm/amdgpu: Updated XGMI power down control support check [AMD Official

[PATCH] drm/amdgpu: Updated XGMI power down control support check

2020-05-13 Thread Clements, John
[AMD Official Use Only - Internal Distribution Only] Updated SMC FW version check to determine if XGMI power down control is supported 0001-drm-amdgpu-Updated-XGMI-power-down-control-support-c.patch Description: 0001-drm-amdgpu-Updated-XGMI-power-down-control-support-c.patch

Re: [PATCH] drm/amdkfd: Provide SMI events watch

2020-05-13 Thread Felix Kuehling
On 2020-05-13 at 3:41 p.m., Amber Lin wrote: > When the compute is malfunctioning or performance drops, the system admin > will use SMI (System Management Interface) tool to monitor/diagnose what > went wrong. This patch provides an event watch interface for the user > space to register devices

[PATCH] drm/amdkfd: Provide SMI events watch

2020-05-13 Thread Amber Lin
When the compute is malfunctioning or performance drops, the system admin will use SMI (System Management Interface) tool to monitor/diagnose what went wrong. This patch provides an event watch interface for the user space to register devices and subscribe to events they are interested in. After

Re: [PATCH 0/6] RFC Support hot device unplug in amdgpu

2020-05-13 Thread Daniel Vetter
On Wed, May 13, 2020 at 10:32:56AM -0400, Andrey Grodzovsky wrote: > > On 5/11/20 5:54 AM, Daniel Vetter wrote: > > On Sat, May 09, 2020 at 02:51:44PM -0400, Andrey Grodzovsky wrote: > > > This RFC is a more of a proof of concept than a fully working solution as > > > there are a few unresolved

Re: Reg. Adaptive Sync feature in xf86-video-amdgpu

2020-05-13 Thread uday kiran pichika
Hello Michel and Team, Can you please help to provide the below details on the Adaptive Sync verification? 1. As you have mentioned in IRC, AMD has verified on Ubuntu machine with Unity/Compiz Compositor. But when i see mesa/src/util/00-mesa-defaults.conf where Mutter and Compiz compositors are

Re: [PATCH 0/6] RFC Support hot device unplug in amdgpu

2020-05-13 Thread Andrey Grodzovsky
On 5/11/20 5:54 AM, Daniel Vetter wrote: On Sat, May 09, 2020 at 02:51:44PM -0400, Andrey Grodzovsky wrote: This RFC is a more of a proof of concept than a fully working solution as there are a few unresolved issues we are hoping to get advice on from people on the mailing list. Until now

[PATCH v5 20/38] drm: radeon: fix common struct sg_table related issues

2020-05-13 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the

[PATCH v5 07/38] drm: amdgpu: fix common struct sg_table related issues

2020-05-13 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the

Re: [PATCH 1/2] drm/radeon: disable AGP by default

2020-05-13 Thread Mathieu Malaterre
On Wed, May 13, 2020 at 1:21 PM Christian König wrote: > > Always use the PCI GART instead. Reviewed-by: Mathieu Malaterre > Signed-off-by: Christian König > --- > drivers/gpu/drm/radeon/radeon_drv.c | 5 - > 1 file changed, 5 deletions(-) > > diff --git

Re: [PATCH 2/2] drm/ttm: deprecate AGP support

2020-05-13 Thread Christian König
On 13.05.20 at 14:34, Daniel Vetter wrote: On Wed, May 13, 2020 at 01:03:13PM +0200, Christian König wrote: Even when core AGP support is compiled in, Radeon and Nouveau can also work with the PCI GART. The AGP support was notoriously unstable and hard to maintain, so deprecate it for now and

Re: [PATCH 2/2] drm/ttm: deprecate AGP support

2020-05-13 Thread Daniel Vetter
On Wed, May 13, 2020 at 01:03:13PM +0200, Christian König wrote: > Even when core AGP support is compiled in, Radeon and > Nouveau can also work with the PCI GART. > > The AGP support was notoriously unstable and hard to > maintain, so deprecate it for now and only enable it if > there is a good

[RFC] Deprecate AGP GART support for Radeon/Nouveau/TTM

2020-05-13 Thread Christian König
Unfortunately AGP is still too widely used for us to just drop support for using its GART. Not using the AGP GART also doesn't mean a loss in functionality since drivers will just fall back to the driver specific PCI GART. For now just deprecate the code and don't enable the AGP GART in TTM

[PATCH 2/2] drm/ttm: deprecate AGP support

2020-05-13 Thread Christian König
Even when core AGP support is compiled in, Radeon and Nouveau can also work with the PCI GART. The AGP support was notoriously unstable and hard to maintain, so deprecate it for now and only enable it if there is a good reason to do so. Signed-off-by: Christian König --- drivers/gpu/drm/Kconfig

[PATCH 1/2] drm/radeon: disable AGP by default

2020-05-13 Thread Christian König
Always use the PCI GART instead. Signed-off-by: Christian König --- drivers/gpu/drm/radeon/radeon_drv.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index bbb0883e8ce6..a71f13116d6b 100644 ---

Re: [Nouveau] [RFC] Remove AGP support from Radeon/Nouveau/TTM

2020-05-13 Thread Thomas Zimmermann
Hi On 13.05.20 at 11:27, Emil Velikov wrote: > On Tue, 12 May 2020 at 20:48, Alex Deucher wrote: > > > There's some AGP support code in the DRM core. Can some of that be declared > as legacy? > > Specifically, what about these AGP-related ioctl calls? Can they be >

Re: [Nouveau] [RFC] Remove AGP support from Radeon/Nouveau/TTM

2020-05-13 Thread Thomas Zimmermann
Hi On 11.05.20 at 19:17, Christian König wrote: > Hi guys, > > Well let's face it AGP is a total headache to maintain and dead for at least > 10+ years. > > We have a lot of x86 specific stuff in the architecture independent graphics > memory management to get the caching right, abusing the

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread Christian König
Since usermode app might open a file, do nothing and close it. That case is unproblematic since closing the debugfs file sets the state of the struct completion to completed again no matter if we waited or not. But when you don't reset in the open() callback we open a small window between

RE: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread Zhao, Jiange
[AMD Official Use Only - Internal Distribution Only] Hi Christian, Since amdgpu_debugfs_wait_dump() would need 'audodump.dumping.done==0' to actually stop and wait for user mode app to dump. Since usermode app might open a file, do nothing and close it. I believe a poll() function would be a

Re: [Nouveau] [RFC] Remove AGP support from Radeon/Nouveau/TTM

2020-05-13 Thread Emil Velikov
On Tue, 12 May 2020 at 20:48, Alex Deucher wrote: > > >> > > >> There's some AGP support code in the DRM core. Can some of that be declared > > >> as legacy? > > >> > > >> Specifically, what about these AGP-related ioctl calls? Can they be > > >> declared as legacy? It would appear to me that

Re: [RFC 02/17] dma-fence: basic lockdep annotations

2020-05-13 Thread Daniel Vetter
On Tue, May 12, 2020 at 11:19 AM Chris Wilson wrote: > Quoting Daniel Vetter (2020-05-12 10:08:47) > > On Tue, May 12, 2020 at 10:04:22AM +0100, Chris Wilson wrote: > > > Quoting Daniel Vetter (2020-05-12 09:59:29) > > > > Design is similar to the lockdep annotations for workers, but with > > > >

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread Christian König
Thanks for the reminder, had too much to do yesterday and just forgot about it. Christian. On 13.05.20 at 10:16, Zhao, Jiange wrote: [AMD Official Use Only - Internal Distribution Only] Hi @Koenig, Christian , I made some changes on top of version 3 and

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread Christian König
On 09.05.20 at 11:45, jia...@amd.com wrote: From: Jiange Zhao When GPU got timeout, it would notify an interested party of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is

Re: [PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread Zhao, Jiange
[AMD Official Use Only - Internal Distribution Only] Hi @Koenig, Christian, I made some changes on top of version 3 and tested it. Can you help review? Jiange From: Zhao, Jiange Sent: Saturday, May 9, 2020 5:45 PM To:

Re: [Nouveau] [PATCH 1/3] drm/radeon: remove AGP support

2020-05-13 Thread Michel Dänzer
On 2020-05-13 9:46 a.m., Christian König wrote: > On 12.05.20 at 23:12, Alex Deucher wrote: >> On Tue, May 12, 2020 at 4:52 PM Roy Spliet wrote: >>> >>> I'll volunteer to be the one asking: how big is this performance >>> difference? Have any benchmarks been run before and after removal of AGP

Re: [Nouveau] [PATCH 1/3] drm/radeon: remove AGP support

2020-05-13 Thread Christian König
On 12.05.20 at 23:12, Alex Deucher wrote: On Tue, May 12, 2020 at 4:52 PM Roy Spliet wrote: On 12-05-2020 at 14:36, Alex Deucher wrote: On Tue, May 12, 2020 at 4:16 AM Michel Dänzer wrote: On 2020-05-11 10:12 p.m., Alex Deucher wrote: On Mon, May 11, 2020 at 1:17 PM Christian König

Re: [RFC 09/17] drm/amdgpu: use dma-fence annotations in cs_submit()

2020-05-13 Thread Daniel Vetter
On Wed, May 13, 2020 at 9:02 AM Christian König wrote: > > On 12.05.20 at 10:59, Daniel Vetter wrote: > > This is a bit tricky, since ->notifier_lock is held while calling > > dma_fence_wait we must ensure that also the read side (i.e. > > dma_fence_begin_signalling) is on the same side. If we

Re: [RFC 09/17] drm/amdgpu: use dma-fence annotations in cs_submit()

2020-05-13 Thread Christian König
On 12.05.20 at 10:59, Daniel Vetter wrote: This is a bit tricky, since ->notifier_lock is held while calling dma_fence_wait we must ensure that also the read side (i.e. dma_fence_begin_signalling) is on the same side. If we mix this up lockdep complains, and that's again why we want to have

Re: [RFC 16/17] drm/amdgpu: gpu recovery does full modesets

2020-05-13 Thread Daniel Vetter
On Tue, May 12, 2020 at 10:10 PM Kazlauskas, Nicholas wrote: > > On 2020-05-12 12:12 p.m., Daniel Vetter wrote: > > On Tue, May 12, 2020 at 4:24 PM Alex Deucher wrote: > >> > >> On Tue, May 12, 2020 at 9:45 AM Daniel Vetter > >> wrote: > >>> > >>> On Tue, May 12, 2020 at 3:29 PM Alex Deucher