Dieter Nützel wrote:
> Am Samstag, 27. Mai 2006 19:13 schrieben Sie:
>> Rune Petersen wrote:
>>> The patch makes radeonWaitIrq() try again if -EBUSY is returned.
>>>
>>> This fixes benchmark 3 & 4 in progs/demos/gltestperf
>>>
>>> Benchmark: 3 - ZSmooth Tex Blend Triangles
>>> Benchmark: 4 - ZSmooth Tex Blend TMesh Triangles
>>>
>>> Not an important app, but other apps might run into this problem.
>>>
>>>
>>>  static void radeonWaitIrq(radeonContextPtr radeon)
>>>  {
>>>     int ret;
>>> +   int tries = 5;
>>>
>>>     do {
>>>             ret = drmCommandWrite(radeon->dri.fd, DRM_RADEON_IRQ_WAIT,
>>>                                   &radeon->iw, sizeof(radeon->iw));
>>> -   } while (ret && (errno == EINTR || errno == EAGAIN));
>>> +   } while (ret && (errno == EINTR || errno == EAGAIN ||
>>> +                    (errno == EBUSY && --tries)));
>> Hmm, interesting. This problem does not appear to be r300 specific,
>> radeon/r200 use the same code (haven't seen problems with it, though).
> 
> I had it for ages with the r200 on my dual Athlon MP2800 and now with the 
> Duron system, too.
> Reported it from time to time.
Ah right. Didn't remember that it was the same test.
Indeed, a quick look suggests that you'd ALWAYS get an abort 
when rendering something which takes more than 3 seconds (and then 
doing something which causes that drmwaitforirq call). (This only works if 
you can queue up all your rendering work in the buffers you can get; 
otherwise you'd wait more than once, but for shorter periods of time. 
Queuing it all up seems to be easy though, by the looks of it.)

> [r200] gltestperf - some progress
> snip
> Benchmark: 4
> ZSmooth Tex Blend TMesh Triangles
> Current size: 400
> Elapsed time for the calibration test (4880): 2.000000
> Selected number of benchmark iterations: 12200
> r200WaitIrq: drmRadeonIrqWait: -16
> 
>> That said, it looks to me like that ioctl will actually never return
>> EAGAIN, maybe the original intention was to just wait indefinitely on
>> EBUSY instead of EAGAIN?
>> (e.g.  while (ret && (errno == EINTR || errno == EBUSY));)
> 
> Any idea?
Well, retrying on EBUSY instead of EAGAIN would fix that. The behaviour 
quite sucks though: while the chip hasn't locked up, the system is 
just completely unresponsive during that period (and as gltestperf 
shows, it can be long...). I'm not sure how you'd really solve that 
problem though, since you don't know in advance how much time the 
rendering will consume, and there is little you can do once you've 
started it. Looking at the code, though, it seems radeondrmWaitIrq is 
never called when the lock is held, so I'd think that retrying on EBUSY 
(instead of EAGAIN) really is the correct solution and should be safe 
(the system should not be completely unresponsive for the whole time; 
after every 3s it should be somewhat responsive).
I've just tried that however, and it doesn't quite work either: at some 
point the xorg ddx thought that the chip had locked up (e.g. I got some 
FIFO timed out messages) and reset the chip, which unsurprisingly 
gltestperf didn't like (it did not actually crash but just stopped; I 
think it depends a bit on your luck what happens).
Looks like quite some more work is needed to detect real lockups without 
just randomly resetting the chip when there is none (which can itself 
lead to lockups IME).
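
For illustration, here is a minimal sketch of the two retry policies discussed 
above (the bounded EBUSY retry from the patch, and retrying on EBUSY 
indefinitely). It is not the driver code: the ioctl is replaced by a 
function pointer so the loop compiles on its own, and the names 
wait_irq_retry, irq_wait_fn and fake_wait are made up for this example.

```c
#include <errno.h>

/* Hypothetical stand-in for the
 * drmCommandWrite(fd, DRM_RADEON_IRQ_WAIT, ...) call:
 * returns 0 on success, nonzero with errno set on failure. */
typedef int (*irq_wait_fn)(void *ctx);

/* Retry while the wait is interrupted (EINTR) or times out (EBUSY).
 * max_tries bounds the total number of attempts that may end in EBUSY;
 * a negative value means "retry on EBUSY indefinitely".
 * Returns 0 on success, or -errno of the error that ended the loop. */
static int wait_irq_retry(irq_wait_fn wait, void *ctx, int max_tries)
{
    int tries = max_tries;

    for (;;) {
        if (wait(ctx) == 0)
            return 0;
        if (errno == EINTR)
            continue;               /* interrupted: always retry */
        if (errno == EBUSY && (max_tries < 0 || --tries > 0))
            continue;               /* timed out: retry while allowed */
        return -errno;              /* give up and report the error */
    }
}

/* Test stub (assumption, not a real API): fails with EBUSY until the
 * counter that ctx points to reaches zero, then succeeds. */
static int fake_wait(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EBUSY;
        return -1;
    }
    return 0;
}
```

With max_tries = 5 this makes at most five attempts, like the "tries = 5" 
patch; with a negative max_tries it matches the "wait indefinitely on 
EBUSY" variant. Neither addresses the ddx resetting the chip in the 
meantime.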

Roland


--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
