Re: r300 radeon 9800 lockup

Nicolai Haehnle Tue, 24 May 2005 10:06:48 -0700

On Tuesday 24 May 2005 18:33, Adam K Kirchhoff wrote:
> Vladimir Dergachev wrote:
> 
> >> Vladimir Dergachev wrote:
> >>
> >>>
> >>> In the past I found it useful not to turn drm debugging on but, rather,
> >>> to insert printk statements in various places in the radeon code. This
> >>> should also provide more information about what is actually going on.
> >>
> >>
> >>
> >> I can't make any promises.  My partner already thinks I spend too much
> >> time in front of the computer :-)  I'll see what I can do, though.
> >> Think a printk statement at the start and end of every function?  Have any
> >
> >
> > This is probably overkill and might not be very useful
> >
> > Rather try, at first, to just add a printk logging which command is
> > being executed (r300_cmd.c) - this is not very thorough, but, maybe,
> > there is a pattern.
> 
> 
> I added a printk for each function in r300_cmdbuf.c...  When Q3A locked
> up, the last thing showing up in syslog was r300_pacify.  So I added
> printk's after every line in r300_pacify :-)  The last thing in syslog
> was:
> 
> OUT_RING( CP_PACKET3( RADEON_CP_NOP, 0 ) )
> OUT_RING( 0x0 )
> ADVANCE_RING()
> 
> So it seems to be making it all the way through r300_pacify, which had 
> been called from r300_check_range, from r300_emit_unchecked_state.
> 
> Here's the sequence:
> r300_emit_raw
> r300_emit_packet3
> r300_emit_raw
> r300_emit_unchecked_state
> r300_check_range
> r300_emit_unchecked_state
> r300_check_range
> r300_pacify
> RING_LOCALS
> BEGIN_RING(6)
> OUT_RING( CP_PACKET0( R300_RB3D_DSTCACHE_CTLSTAT, 0 ) )
> OUT_RING( 0xa )
> OUT_RING( CP_PACKET0( 0x4f18, 0 ) )
> OUT_RING( 0x3 )
> OUT_RING( CP_PACKET3( RADEON_CP_NOP, 0 ) )
> OUT_RING( 0x0 )
> ADVANCE_RING()
>                                                          
> 
> Does this tell us anything?


Unfortunately, I don't think so. The thing is, all those OUT_RING and 
ADVANCE_RING commands do not really call into the hardware immediately; all 
they do is write stuff to the ring buffer, but the ring buffer is just some 
memory area without any magic of its own.

Only a call to COMMIT_RING will tell the hardware that new commands are 
waiting in the ring buffer, and the only thing we do know is that 
*something* in the ring buffer before the last COMMIT_RING causes the chip 
to hang.
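
To illustrate the difference between writing and committing, here is a toy user-space model. It is only a sketch: the sim_* names are made up for this mail and are not the real DRM macros, and there is no real hardware behind it. Writing a dword only changes memory; the simulated chip learns about it only when the write position is published.

```c
/* Toy user-space model; all sim_* names are hypothetical, not DRM code. */
#include <assert.h>
#include <stdint.h>

#define SIM_RING_DWORDS 64u /* power of two so the index can be masked */

struct sim_ring {
    uint32_t buf[SIM_RING_DWORDS]; /* plain memory, no side effects on write */
    uint32_t tail;                 /* CPU-side write position */
    uint32_t hw_wptr;              /* write position the simulated chip knows */
};

/* Plays the role of OUT_RING: store one dword and advance the tail. */
static void sim_out_ring(struct sim_ring *r, uint32_t dword)
{
    assert(r->tail < SIM_RING_DWORDS); /* index invariant */
    r->buf[r->tail] = dword;
    r->tail = (r->tail + 1u) & (SIM_RING_DWORDS - 1u);
}

/* Plays the role of COMMIT_RING: publish the write position to the chip. */
static void sim_commit_ring(struct sim_ring *r)
{
    r->hw_wptr = r->tail;
}

/* Number of dwords written but not yet visible to the simulated chip. */
static uint32_t sim_pending(const struct sim_ring *r)
{
    return (r->tail - r->hw_wptr) & (SIM_RING_DWORDS - 1u);
}
```

In this model, a log line printed right after sim_out_ring() only shows that memory was written. It says nothing about whether the chip has seen, let alone executed, that dword.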

So another possible way to investigate this could be:
- Call radeon_do_wait_for_idle() at the end of the COMMIT_RING macro, and 
define RADEON_FIFO_DEBUG (this will print out additional information when 
wait_for_idle fails)
- Incrementally add COMMIT_RING calls to r300_cmdbuf.c to pinpoint the 
exact location of the problem, if at all possible.
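
The idea behind these two steps can be sketched as a small simulation. Again, this is hypothetical user-space code, not the driver: sim_chip stands in for the hardware and simply "hangs" when it consumes one particular dword, and sim_wait_for_idle() stands in for an idle check after each commit. The smaller the batch between commits, the more precisely the failing command is located.

```c
/* Toy simulation; all sim_* names are hypothetical, not DRM code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the chip: it "hangs" when it consumes bad_dword. */
struct sim_chip {
    uint32_t bad_dword;
    int hung;
};

/* Consume cmds[from..to) the way the chip would after a commit. */
static void sim_chip_consume(struct sim_chip *c, const uint32_t *cmds,
                             size_t from, size_t to)
{
    for (size_t i = from; i < to && !c->hung; i++)
        if (cmds[i] == c->bad_dword)
            c->hung = 1;
}

/* Stand-in for an idle check: 0 if idle, -1 if the chip is hung. */
static int sim_wait_for_idle(const struct sim_chip *c)
{
    return c->hung ? -1 : 0;
}

/*
 * Commit every `stride` commands and check for idle after each commit.
 * Returns the index of the first command in the failing batch, or -1
 * if no batch caused a hang.
 */
static long sim_find_hang(struct sim_chip *c, const uint32_t *cmds,
                          size_t n, size_t stride)
{
    assert(stride > 0);
    for (size_t start = 0; start < n; start += stride) {
        size_t end = (start + stride < n) ? start + stride : n;
        sim_chip_consume(c, cmds, start, end);
        if (sim_wait_for_idle(c) != 0)
            return (long)start;
    }
    return -1;
}
```

With a stride of 1 the returned index is the offending command itself; with a larger stride it only identifies the batch, which is why adding more commits narrows things down.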

It would be very helpful if you could single out one command we send using 
this procedure.

Note that in the worst case (depending on the actual nature of the lockup in 
hardware), those debugging changes could actually *remove* the lockup (e.g. 
because they remove a race condition that caused the lockup in the first 
place).

cu,
Nicolai
