Re: [Nouveau] [PATCH 5.10 32/77] drm/ttm: fix memleak in ttm_transfered_destroy
On 2021-11-03 21:32 +0100, Karol Herbst wrote:

> On Wed, Nov 3, 2021 at 9:29 PM Karol Herbst wrote:
>>
>> On Wed, Nov 3, 2021 at 8:52 PM Sven Joachim wrote:
>> >
>> > On 2021-11-01 10:17 +0100, Greg Kroah-Hartman wrote:
>> >
>> > > From: Christian König
>> > >
>> > > commit 0db55f9a1bafbe3dac750ea669de9134922389b5 upstream.
>> > >
>> > > We need to cleanup the fences for ghost objects as well.
>> > >
>> > > Signed-off-by: Christian König
>> > > Reported-by: Erhard F.
>> > > Tested-by: Erhard F.
>> > > Reviewed-by: Huang Rui
>> > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214029
>> > > Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214447
>> > > CC:
>> > > Link: https://patchwork.freedesktop.org/patch/msgid/20211020173211.2247-1-christian.koe...@amd.com
>> > > Signed-off-by: Greg Kroah-Hartman
>> > > ---
>> > >  drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
>> > >  1 file changed, 1 insertion(+)
>> > >
>> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> > > @@ -322,6 +322,7 @@ static void ttm_transfered_destroy(struc
>> > >       struct ttm_transfer_obj *fbo;
>> > >
>> > >       fbo = container_of(bo, struct ttm_transfer_obj, base);
>> > > +     dma_resv_fini(&fbo->base.base._resv);
>> > >       ttm_bo_put(fbo->bo);
>> > >       kfree(fbo);
>> > >  }
>> >
>> > Alas, this innocuous-looking commit causes one of my systems to lock up
>> > as soon as I run startx. This happens with the nouveau driver; two other
>> > systems with radeon and intel graphics are not affected. Also, I only
>> > noticed it in 5.10.77. Kernels 5.15 and 5.14.16 are not affected, and I
>> > do not use 5.4 anymore.
>> >
>> > I am not familiar with nouveau's ttm management and what has changed
>> > there between 5.10 and 5.14, but maybe one of their developers can shed
>> > some light on this.
>> >
>> > Cheers,
>> > Sven
>> >
>>
>> could be related to 265ec0dd1a0d18f4114f62c0d4a794bb4e729bc1
>
> maybe not.. but I did remember there being a few ttm-related patches
> which only hurt nouveau :/ I guess one could do a git bisect to
> figure out what change "fixes" it.

Maybe, but since the memory leaks reported by Erhard only started to
show up in 5.14 (if I read the bugzilla reports correctly), perhaps the
patch should simply be reverted on earlier kernels?

> On which GPU do you see this problem?

On an old GeForce 8500 GT; the whole PC is rather ancient.

Cheers,
Sven
Re: [Nouveau] [PATCH 5.10 32/77] drm/ttm: fix memleak in ttm_transfered_destroy
On 2021-11-01 10:17 +0100, Greg Kroah-Hartman wrote:

> From: Christian König
>
> commit 0db55f9a1bafbe3dac750ea669de9134922389b5 upstream.
>
> We need to cleanup the fences for ghost objects as well.
>
> Signed-off-by: Christian König
> Reported-by: Erhard F.
> Tested-by: Erhard F.
> Reviewed-by: Huang Rui
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214029
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214447
> CC:
> Link: https://patchwork.freedesktop.org/patch/msgid/20211020173211.2247-1-christian.koe...@amd.com
> Signed-off-by: Greg Kroah-Hartman
> ---
>  drivers/gpu/drm/ttm/ttm_bo_util.c | 1 +
>  1 file changed, 1 insertion(+)
>
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -322,6 +322,7 @@ static void ttm_transfered_destroy(struc
>       struct ttm_transfer_obj *fbo;
>
>       fbo = container_of(bo, struct ttm_transfer_obj, base);
> +     dma_resv_fini(&fbo->base.base._resv);
>       ttm_bo_put(fbo->bo);
>       kfree(fbo);
>  }

Alas, this innocuous-looking commit causes one of my systems to lock up
as soon as I run startx. This happens with the nouveau driver; two other
systems with radeon and intel graphics are not affected. Also, I only
noticed it in 5.10.77. Kernels 5.15 and 5.14.16 are not affected, and I
do not use 5.4 anymore.

I am not familiar with nouveau's ttm management and what has changed
there between 5.10 and 5.14, but maybe one of their developers can shed
some light on this.

Cheers,
Sven
Re: [Nouveau] nouveau xorg driver - compile error
On 2014-07-01 21:47 +0200, Pali Rohár wrote:

> Hello,
>
> nouveau xorg driver from master git repository cannot be compiled on
> ubuntu precise. Here is build log:
>
> https://launchpadlibrarian.net/179062836/buildlog_ubuntu-precise-amd64.xserver-xorg-video-nouveau_1:1.0.10-git201407010817~ubuntu12.04.1_FAILEDTOBUILD.txt.gz
>
> Problem is somewhere in drmmode_display.c. Can somebody look at it and
> fix compile error problems?

Don't know how best to fix it, but the problem is that xorg_list only
exists in X server 1.12 and later; Ubuntu precise has 1.11.

Cheers,
Sven
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] [REGRESSION] system does not resume from ram due to commit drm/nv50/fifo: prevent races between clients updating playlists
On 2013-05-26 23:09 +0200, Maarten Maathuis wrote:

> My NV96 does not resume from suspend to RAM (the screen stays black,
> magic sysrq keys do work) with the current Linus git kernel. I bisected
> it to the following commit:
>
> drm/nv50/fifo: prevent races between clients updating playlists
> b5096566f6e1ee2b88324772f020ae9bc0cfa9a0
>
> It's not obvious to me how this causes problems, but reverting this
> commit does solve my problem.

Same here on my NV86.

Cheers,
Sven
Re: [Nouveau] Fix for vblank on nvc0
On 2012-11-12 22:52 +0100, Maarten Lankhorst wrote:

> On 12-11-12 22:30, Kelly Doran wrote:
>> I had Sven test this patch... he said it works. I think the chipset
>> number test was executing code that we thought ran only for 0x50 or
>> 0xc0, but the chipset number is actually more specific, with values
>> like 0x92.
>
> Oh right, vblank is busted anyway... it needs to be
> nv_device(priv)->card_type == NV_50 to work. Your patch is the correct
> fix.

The patch has landed in the nouveau/master branch now, but not in
drm-nouveau-fixes, where it is also needed.

Cheers,
Sven
Re: [Nouveau] [bisected] nouveau: Failed to idle channel x after resume
On 2012-08-08 08:18 +0200, Sven Joachim wrote:

> On 2012-08-08 08:08 +0200, Ben Skeggs wrote:
>
>> On Wed, Aug 08, 2012 at 08:00:21AM +0200, Sven Joachim wrote:
>>>
>>> Not for me on my GeForce 8500 GT, and I still cannot suspend more
>>> than once; subsequent attempts fail:
>>>
>>> ,----
>>> | Aug 8 07:49:16 turtle kernel: [ 91.697068] nouveau W[ PGRAPH][:01:00.0][0x0200502d][880037be1d40] parent failed suspend, -16
>>> | Aug 8 07:49:16 turtle kernel: [ 91.697078] nouveau [ DRM][:01:00.0] resuming display...
>>> `----
>> Interesting. Were there any messages prior to that?
>
> Nothing interesting:
>
> ,----
> | Aug 8 07:49:16 turtle kernel: [ 89.655362] nouveau [ DRM][:01:00.0] suspending fbcon...
> | Aug 8 07:49:16 turtle kernel: [ 89.655367] nouveau [ DRM][:01:00.0] suspending display...
> | Aug 8 07:49:16 turtle kernel: [ 89.696888] nouveau [ DRM][:01:00.0] unpinning framebuffer(s)...
> | Aug 8 07:49:16 turtle kernel: [ 89.696909] nouveau [ DRM][:01:00.0] evicting buffers...
> | Aug 8 07:49:16 turtle kernel: [ 89.696913] nouveau [ DRM][:01:00.0] suspending client object trees...
> `----
>
>> I guess the fifo code detected a timeout when trying to save the
>> graphics context. I have other patches in my tree (I'll push them
>> soon, tied up with other work atm) that might help here.
>
> Thanks, I'll try them when they are available.

With current nouveau master (drm/nouveau: fix find/replace bug in
license header) suspending works again, thanks! However, it is a bit
slow, taking between two and five seconds:

,----
| Aug 13 18:17:56 turtle kernel: [ 678.524814] PM: Syncing filesystems ... done.
| Aug 13 18:18:09 turtle kernel: [ 678.639202] Freezing user space processes ... (elapsed 0.01 seconds) done.
| Aug 13 18:18:09 turtle kernel: [ 678.649954] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
| Aug 13 18:18:09 turtle kernel: [ 678.663298] Suspending console(s) (use no_console_suspend to debug)
| Aug 13 18:18:09 turtle kernel: [ 678.680884] sd 0:0:0:0: [sda] Synchronizing SCSI cache
| Aug 13 18:18:09 turtle kernel: [ 678.681000] sd 0:0:0:0: [sda] Stopping disk
| Aug 13 18:18:09 turtle kernel: [ 678.695141] parport_pc 00:07: disabled
| Aug 13 18:18:09 turtle kernel: [ 678.695204] serial 00:06: disabled
| Aug 13 18:18:09 turtle kernel: [ 678.695209] serial 00:06: wake-up capability disabled by ACPI
| Aug 13 18:18:09 turtle kernel: [ 678.695235] nouveau [ DRM][:01:00.0] suspending fbcon...
| Aug 13 18:18:09 turtle kernel: [ 678.695239] nouveau [ DRM][:01:00.0] suspending display...
| Aug 13 18:18:09 turtle kernel: [ 678.742111] nouveau [ DRM][:01:00.0] unpinning framebuffer(s)...
| Aug 13 18:18:09 turtle kernel: [ 678.742189] nouveau [ DRM][:01:00.0] evicting buffers...
| Aug 13 18:18:09 turtle kernel: [ 682.357319] nouveau [ DRM][:01:00.0] suspending client object trees...
| Aug 13 18:18:09 turtle kernel: [ 683.526646] PM: suspend of devices complete after 4863.181 msecs
`----

With the 3.4.8 kernel, suspending takes little more than one second.

Cheers,
Sven
Re: [Nouveau] [bisected] nouveau: Failed to idle channel x after resume
On 2012-08-08 07:37 +0200, Ben Skeggs wrote:

> On Mon, Aug 06, 2012 at 11:38:04PM +0300, Maxim Levitsky wrote:
>> On Sat, 2012-08-04 at 17:41 +0300, Maxim Levitsky wrote:
>>> On Mon, 2012-07-23 at 18:25 +0300, Aioanei Rares wrote:
>>>> On Thu, Jul 5, 2012 at 11:24 PM, Martin Nyhus martin.ny...@gmx.com wrote:
>>>>> On Mon, 11 Jun 2012 23:18:42 +0200 Martin Nyhus wrote:
>>>>>> after resuming from suspend nouveau starts writing "Failed to
>>>>>> idle channel x" (where x is 2 or 3) to the log and X appears to
>>>>>> stop and then restart only to stop again. Starting Firefox after
>>>>>> resuming triggers the bugs every time, and bisecting leads to
>>>>>> 03bd6efa (drm/nv50/fifo: use hardware channel kickoff
>>>>>> functionality).
>>>>>
>>>>> Hi Ben,
>>>>> I'm still seeing this bug with the latest from Linus
>>>>> (v3.5-rc5-98-g9e85a6f) and linux-next (next-20120705).
>>>>>
>>>>> lspci output:
>>>>> 01:00.0 VGA compatible controller: nVidia Corporation G86 [GeForce 8400M GS] (rev a1)
>>>>>
>>>>> Sorry I haven't followed up on this earlier,
>>>>> Martin
>>>>
>>>> I can confirm this with 3.5.0, Chromium and Arch Linux. It's an HP
>>>> Pavilion laptop with a G86 [GeForce 8400 M GS] video card. Seems
>>>> related to this bug:
>>>> http://lists.freedesktop.org/archives/nouveau/2011-January/007358.html .
>>>> If I can do anything else to help, I will be glad to.
>>>
>>> Added nouveau@lists.freedesktop.org
>>>
>>> I confirm the same issue here. Will try to dig into it.
>>
>> Nope, can't dig this :-(
>
> Interestingly, this works just fine for me after the driver rework.

Not for me on my GeForce 8500 GT, and I still cannot suspend more than
once; subsequent attempts fail:

,----
| Aug 8 07:49:16 turtle kernel: [ 91.697068] nouveau W[ PGRAPH][:01:00.0][0x0200502d][880037be1d40] parent failed suspend, -16
| Aug 8 07:49:16 turtle kernel: [ 91.697078] nouveau [ DRM][:01:00.0] resuming display...
`----

> I can confirm issues on G86 with 3.5/3.6-rc1 stock though. I'll
> attempt to find a fix suitable for the non-reworked driver.

Thanks. I'm currently stuck on 3.4 because of this problem.

Cheers,
Sven
Re: [Nouveau] [bisected] nouveau: Failed to idle channel x after resume
On 2012-08-08 08:08 +0200, Ben Skeggs wrote:

> On Wed, Aug 08, 2012 at 08:00:21AM +0200, Sven Joachim wrote:
>>
>> Not for me on my GeForce 8500 GT, and I still cannot suspend more
>> than once; subsequent attempts fail:
>>
>> ,----
>> | Aug 8 07:49:16 turtle kernel: [ 91.697068] nouveau W[ PGRAPH][:01:00.0][0x0200502d][880037be1d40] parent failed suspend, -16
>> | Aug 8 07:49:16 turtle kernel: [ 91.697078] nouveau [ DRM][:01:00.0] resuming display...
>> `----
> Interesting. Were there any messages prior to that?

Nothing interesting:

,----
| Aug 8 07:49:16 turtle kernel: [ 89.655362] nouveau [ DRM][:01:00.0] suspending fbcon...
| Aug 8 07:49:16 turtle kernel: [ 89.655367] nouveau [ DRM][:01:00.0] suspending display...
| Aug 8 07:49:16 turtle kernel: [ 89.696888] nouveau [ DRM][:01:00.0] unpinning framebuffer(s)...
| Aug 8 07:49:16 turtle kernel: [ 89.696909] nouveau [ DRM][:01:00.0] evicting buffers...
| Aug 8 07:49:16 turtle kernel: [ 89.696913] nouveau [ DRM][:01:00.0] suspending client object trees...
`----

> I guess the fifo code detected a timeout when trying to save the
> graphics context. I have other patches in my tree (I'll push them
> soon, tied up with other work atm) that might help here.

Thanks, I'll try them when they are available.

Cheers,
Sven
Re: [Nouveau] ../nouveau_drv.so: undefined symbol: RegionEmptyData
On 2010-12-29 19:40 +0100, Morten Bo Johansen wrote:

> I get the above error message when trying to run the latest
> xf86-video-nouveau driver from the git repository. It prevents the
> driver from loading. If I revert back to the Debian version
> 1:0.0.15+git20100329+7858345-5 of that driver, then the driver loads,
> but there are many issues with that version that I hoped an upgrade
> could fix.
>
> Information about my hardware and software:
>
> GPU: GigaByte GV-N68128DH
> Kernel: 2.6.37 with the latest Nouveau git tree
> Drm: latest git version
> Nouveau: latest git version
> xserver-xorg-core: 1.7.7-10
> xserver-xorg-dev: 1.9.0.902-1

You need to install xserver-xorg-core from Debian experimental as well;
1.7.7 is too old.

Sven