Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-13 Thread Lyude Paul
Nevermind - I don't think I'll need the logs, I stared at the code for long
enough and I think I realized what's happening.

I will have a patch for you to test in just a moment, just waiting for it to
compile so I can verify nothing else breaks

On Wed, 2023-12-13 at 18:48 -0500, Lyude Paul wrote:
> Hopefully you're still on at this point - if you are, could you try starting
> the machine up with the following kernel module arguments passed to nouveau?
> 
> debug=disp=trace
> 
> Then see if you can find any lines that mention INHERIT? I have a feeling I'm
> just going to have to add a workaround for the time being, but I'd really love
> to know how we're managing to get that far on a hardware generation we never
> implemented that nvkm ioctl for…
> 
> On Wed, 2023-12-13 at 18:37 -0500, Lyude Paul wrote:
> > agh - thank you for repeatedly poking on this, I've been busy enough with GSP
> > work I totally missed this. Yes - I'm quite surprised that this is blowing up,
> > but considering that looks to be a GT218 I guess display state readback must
> > just work a bit differently there since that's really early on into the NV50
> > days.
> > 
> > The reason that was a drm_WARN_ON() was because it indicates that we're not
> > reading back OR -> head assignments properly. But, I'm confused how we're even
> > getting that far on a non-GSP platform. I'm going to dig into this now, but if
> > I don't figure out a good fix by the end of the day I'll just send a patch to
> > silence the warning.
> > 
> > Thanks again for bugging me about this!
> > 
> > On Wed, 2023-12-13 at 13:49 +0100, Borislav Petkov wrote:
> > > On Wed, Dec 13, 2023 at 12:39:36PM +0100, Borislav Petkov wrote:
> > > > We're getting close to releasing so I guess we either debug this or shut
> > > > up the WARN.
> > > 
> > > Not only that - panic_on_warn turns this into an explosion so you don't
> > > want that in a released kernel.
> > > 
> > 
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-13 Thread Lyude Paul
Hopefully you're still on at this point - if you are, could you try starting
the machine up with the following kernel module arguments passed to nouveau?

debug=disp=trace

Then see if you can find any lines that mention INHERIT? I have a feeling I'm
just going to have to add a workaround for the time being, but I'd really love
to know how we're managing to get that far on a hardware generation we never
implemented that nvkm ioctl for…

On Wed, 2023-12-13 at 18:37 -0500, Lyude Paul wrote:
> agh - thank you for repeatedly poking on this, I've been busy enough with GSP
> work I totally missed this. Yes - I'm quite surprised that this is blowing up,
> but considering that looks to be a GT218 I guess display state readback must
> just work a bit differently there since that's really early on into the NV50
> days.
> 
> The reason that was a drm_WARN_ON() was because it indicates that we're not
> reading back OR -> head assignments properly. But, I'm confused how we're even
> getting that far on a non-GSP platform. I'm going to dig into this now, but if
> I don't figure out a good fix by the end of the day I'll just send a patch to
> silence the warning.
> 
> Thanks again for bugging me about this!
> 
> On Wed, 2023-12-13 at 13:49 +0100, Borislav Petkov wrote:
> > On Wed, Dec 13, 2023 at 12:39:36PM +0100, Borislav Petkov wrote:
> > > We're getting close to releasing so I guess we either debug this or shut
> > > up the WARN.
> > 
> > Not only that - panic_on_warn turns this into an explosion so you don't
> > want that in a released kernel.
> > 
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-13 Thread Lyude Paul
agh - thank you for repeatedly poking on this, I've been busy enough with GSP
work I totally missed this. Yes - I'm quite surprised that this is blowing up,
but considering that looks to be a GT218 I guess display state readback must
just work a bit differently there since that's really early on into the NV50
days.

The reason that was a drm_WARN_ON() was because it indicates that we're not
reading back OR -> head assignments properly. But, I'm confused how we're even
getting that far on a non-GSP platform. I'm going to dig into this now, but if
I don't figure out a good fix by the end of the day I'll just send a patch to
silence the warning.

Thanks again for bugging me about this!

On Wed, 2023-12-13 at 13:49 +0100, Borislav Petkov wrote:
> On Wed, Dec 13, 2023 at 12:39:36PM +0100, Borislav Petkov wrote:
> > We're getting close to releasing so I guess we either debug this or shut
> > up the WARN.
> 
> Not only that - panic_on_warn turns this into an explosion so you don't
> want that in a released kernel.
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-13 Thread Borislav Petkov
On Wed, Dec 13, 2023 at 12:39:36PM +0100, Borislav Petkov wrote:
> We're getting close to releasing so I guess we either debug this or shut
> up the WARN.

Not only that - panic_on_warn turns this into an explosion so you don't
want that in a released kernel.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-13 Thread Borislav Petkov
On Tue, Dec 12, 2023 at 10:35:51PM -0500, Paul Dufresne wrote:
> https://gitlab.freedesktop.org/drm/nouveau/-/issues/282

Let's add more folks who were involved in

1b477f42285e ("drm/nouveau/kms: Add INHERIT ioctl to nvkm/nvif for reading IOR state")

Apparently, someone wants to know that the loop over the crtcs in
nv50_display_read_hw_or_state() didn't find a head.
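
As a rough illustration of the kind of check being described, here is a minimal userspace sketch in C. It assumes that state readback yields a head index and that the loop searches the CRTC list for a CRTC with that index; that matching rule, and every name below (struct sketch_crtc, sketch_find_head, sketch_read_or_state), is hypothetical and is not taken from the nouveau source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a CRTC; the real driver uses struct drm_crtc. */
struct sketch_crtc {
    int index; /* head number this CRTC represents */
};

/*
 * Search the CRTC array for the head index reported by state readback.
 * Returns the matching CRTC, or NULL if no CRTC has that index.
 */
static const struct sketch_crtc *
sketch_find_head(const struct sketch_crtc *crtcs, size_t count, int head_idx)
{
    assert(crtcs != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (crtcs[i].index == head_idx)
            return &crtcs[i];
    }
    return NULL;
}

/*
 * Model of the warning path: returns true when a head was found, and
 * otherwise prints a message in place of drm_WARN_ON(!found_head).
 */
static bool sketch_read_or_state(const struct sketch_crtc *crtcs, size_t count,
                                 int head_idx)
{
    bool found_head = sketch_find_head(crtcs, count, head_idx) != NULL;

    if (!found_head)
        fprintf(stderr, "sketch: no head matches index %d\n", head_idx);
    return found_head;
}
```

With two CRTCs indexed 0 and 1, a readback value of 1 finds a head, while a value such as 5 falls through the loop and takes the warning path. In this model, the warning means the value read back from hardware does not correspond to any head the driver knows about.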

Holler if you need me to run a debug patch to figure out why.

We're getting close to releasing so I guess we either debug this or shut
up the WARN.

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-13 Thread Paul Dufresne
https://gitlab.freedesktop.org/drm/nouveau/-/issues/282

Re: nouveau 0000:01:00.0: drm_WARN_ON(!found_head)

2023-12-12 Thread Borislav Petkov
On Sat, Nov 11, 2023 at 01:03:23PM +0100, Borislav Petkov wrote:
> Hi,
> 
> this is on top of Linus' tree from the 4th (lemme know if I should try
> the latest) on one of my test boxes:
> 
> nouveau :01:00.0: vgaarb: deactivate vga console
> Console: switching to colour dummy device 80x25
> nouveau :01:00.0: NVIDIA GT218 (0a8280b1)
> CE: hpet increased min_delta_ns to 20115 nsec
> nouveau :01:00.0: bios: version 70.18.49.00.00
> nouveau :01:00.0: fb: 1024 MiB DDR3
> nouveau :01:00.0: DRM: VRAM: 1024 MiB
> nouveau :01:00.0: DRM: GART: 1048576 MiB
> nouveau :01:00.0: DRM: TMDS table version 2.0
> nouveau :01:00.0: DRM: MM: using COPY for buffer copies
> [ cut here ]
> nouveau :01:00.0: drm_WARN_ON(!found_head)
> WARNING: CPU: 4 PID: 786 at drivers/gpu/drm/nouveau/dispnv50/disp.c:2731 nv50_display_init+0x28c/0x4f0 [nouveau]
> Modules linked in: nouveau(+) drm_ttm_helper ttm video drm_exec drm_gpuvm gpu_sched drm_display_helper wmi
> CPU: 4 PID: 786 Comm: systemd-udevd Not tainted 6.6.0+ #1

This still fires on -rc5:

[4.577348] nouveau :01:00.0: vgaarb: deactivate vga console
[4.584482] Console: switching to colour dummy device 80x25
[4.590120] nouveau :01:00.0: NVIDIA GT218 (0a8280b1)
[4.718171] nouveau :01:00.0: bios: version 70.18.49.00.00
[4.724788] nouveau :01:00.0: fb: 1024 MiB DDR3
[6.047984] nouveau :01:00.0: DRM: VRAM: 1024 MiB
[6.053031] nouveau :01:00.0: DRM: GART: 1048576 MiB
[6.058340] nouveau :01:00.0: DRM: TMDS table version 2.0
[6.065892] nouveau :01:00.0: DRM: MM: using COPY for buffer copies
[6.078375] [ cut here ]
[6.082994] nouveau :01:00.0: drm_WARN_ON(!found_head)
[6.083023] WARNING: CPU: 3 PID: 779 at drivers/gpu/drm/nouveau/dispnv50/disp.c:2731 nv50_display_init+0x28c/0x4f0 [nouveau]
[6.099800] Modules linked in: nouveau(+) drm_ttm_helper ttm video drm_exec drm_gpuvm gpu_sched drm_display_helper wmi
[6.110490] CPU: 3 PID: 779 Comm: systemd-udevd Not tainted 6.7.0-rc5+ #2
[6.117272] Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7599/870-C45 (MS-7599), BIOS V1.15 03/04/2011
[6.127087] RIP: 0010:nv50_display_init+0x28c/0x4f0 [nouveau]
[6.132915] Code: 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 cd 16 37 e1 48 c7 c1 4c 55 2d a0 48 89 c6 4c 89 ea 48 c7 c7 42 55 2d a0 e8 44 5a e8 e0 <0f> 0b 48 8b 43 08 49 39 c6 48 8d 58 f8 0f 85 41 ff ff ff 48 8d 7c
[6.151660] RSP: 0018:c936ba98 EFLAGS: 00010286
[6.156885] RAX: 002e RBX: 8881009fbc00 RCX: 
[6.164013] RDX: 0002 RSI: c936b9b0 RDI: 0001
[6.171141] RBP: 888103fc8ad0 R08: 888136ffdfe8 R09: 0058
[6.178263] R10: 027a R11: 888136401b70 R12: 888103fc8800
[6.185393] R13: 888100abddf0 R14: 888103fc8ab0 R15: 
[6.192521] FS:  7fdc144858c0() GS:88812f4c() knlGS:
[6.200601] CS:  0010 DS:  ES:  CR0: 80050033
[6.206339] CR2: 55676cc01000 CR3: 000103f6c000 CR4: 06f0
[6.213466] Call Trace:
[6.215921]  
[6.218015]  ? __warn+0x96/0x160
[6.221240]  ? nv50_display_init+0x28c/0x4f0 [nouveau]
[6.226461]  ? report_bug+0x1ec/0x200
[6.230119]  ? handle_bug+0x3c/0x70
[6.233611]  ? exc_invalid_op+0x1f/0x90
[6.237442]  ? asm_exc_invalid_op+0x16/0x20
[6.241622]  ? nv50_display_init+0x28c/0x4f0 [nouveau]
[6.246840]  ? nv50_display_init+0x28c/0x4f0 [nouveau]
[6.252058]  ? sched_set_fifo+0x46/0x60
[6.255897]  nouveau_display_init+0xa0/0xd0 [nouveau]
[6.261031]  nouveau_drm_device_init+0x42a/0x990 [nouveau]
[6.266604]  nouveau_drm_probe+0x105/0x240 [nouveau]
[6.271651]  ? __pm_runtime_resume+0x68/0xa0
[6.275920]  pci_device_probe+0xaa/0x140
[6.279840]  really_probe+0xc2/0x2d0
[6.283411]  __driver_probe_device+0x73/0x120
[6.287761]  driver_probe_device+0x2c/0xb0
[6.291851]  __driver_attach+0xa0/0x150
[6.295683]  ? __device_attach_driver+0xc0/0xc0
[6.300205]  bus_for_each_dev+0x67/0xa0
[6.304044]  bus_add_driver+0x10e/0x210
[6.307874]  driver_register+0x5c/0x120
[6.311706]  ? 0xa0336000
[6.315017]  do_one_initcall+0x44/0x200
[6.318851]  ? kmalloc_trace+0x37/0xc0
[6.322595]  do_init_module+0x64/0x230
[6.326344]  init_module_from_file+0x8d/0xd0
[6.330609]  idempotent_init_module+0x15a/0x210
[6.335136]  __x64_sys_finit_module+0x67/0xb0
[6.339490]  do_syscall_64+0x41/0xf0
[6.343066]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[6.348118] RIP: 0033:0x7fdc14947ee9
[6.351691] Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f7 ee 0e 00 f7 d8 64 89 01 48
[6.370433] RSP: