* Jon Smirl <[EMAIL PROTECTED]> wrote:

> A better solution might be to loop twenty times to pick up the very
> short commands. After 20us switch to a loop that allows the kernel to
> schedule. You don't want to immediately schedule since that will kill
> graphics performance.

You can busy-loop just fine as long as you stay preemptible. Also, if
need_resched() is true then you should get out of your locks ASAP.

the _ability_ to preempt quickly should not affect 3D performance in any
noticeable way: if the only app running is a 3D app, or it has a high
priority (because the user wants good 3D performance) then it will
perform just fine. Conversely, if an audio app has high priority then it
must see low latencies. So this is not an act of balancing, this is an
act of _enabling_ low latencies. Looping in the kernel for even 1 msec
while holding locks is very, very bad and should be avoided at all
costs. In fact you should shoot for as quick preemptibility as possible
whenever you are looping for 3D completion, because your code is not
doing anything productive while it polls. 'quick preemptibility' !=
scheduling away, it only means 'scheduling away if need_resched() is
set'.

> What's the right way to write a loop like this that meets the above
> requirements and also satisfies the audio needs?

>         for ( i = 0 ; i < dev_priv->usec_timeout ; i++ ) {
>                 if ( !(RADEON_READ( RADEON_RBBM_STATUS )
>                        & RADEON_RBBM_ACTIVE) ) {
>                         radeon_do_pixcache_flush( dev_priv );
>                         return 0;
>                 }
>                 DRM_UDELAY( 1 );

add:

        if (need_resched()) {
                break locks ...
                cond_resched();
                reacquire locks ...
        }

and make sure the breaking of the locks is safe at that point (no other
3D application should be able to interfere with your polling, etc.).

this will solve all the audio complaints.
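
To make the control flow concrete, here is a minimal userspace model of the polling loop with the need_resched() check added. It is only a sketch under stated assumptions: `struct sim`, `engine_active()`, `sim_need_resched()` and `sim_cond_resched()` are hypothetical stand-ins invented for this example, not the real kernel or radeon functions, and the lock is modelled as a plain flag that the caller holds on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simulation state; none of this is a real kernel API. */
struct sim {
    int  busy_polls;     /* engine reports "active" for this many reads */
    int  resched_at;     /* iteration at which a reschedule is requested, -1 = never */
    bool locked;         /* models the driver lock held by the caller */
    int  resched_count;  /* how often the loop scheduled away */
};

/* Stand-in for RADEON_READ(RADEON_RBBM_STATUS) & RADEON_RBBM_ACTIVE. */
static bool engine_active(struct sim *s)
{
    if (s->busy_polls > 0) {
        s->busy_polls--;
        return true;
    }
    return false;
}

/* Stand-in for need_resched(). */
static bool sim_need_resched(const struct sim *s, int i)
{
    return i == s->resched_at;
}

/* Stand-in for cond_resched(); must never run with the lock held. */
static void sim_cond_resched(struct sim *s)
{
    assert(!s->locked);
    s->resched_count++;
}

/* Returns 0 once the engine is idle, -1 if the timeout expires. */
static int wait_for_idle(struct sim *s, int usec_timeout)
{
    assert(s->locked);                  /* caller holds the lock on entry */
    for (int i = 0; i < usec_timeout; i++) {
        if (!engine_active(s))
            return 0;                   /* the pixcache flush would go here */
        /* DRM_UDELAY(1) would go here */
        if (sim_need_resched(s, i)) {
            s->locked = false;          /* break locks */
            sim_cond_resched(s);
            s->locked = true;           /* reacquire locks */
        }
    }
    return -1;
}
```

The point of the model is the ordering: the lock is dropped before scheduling away and retaken before the next status read, so the loop itself never sleeps with the lock held.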

as a bonus, if you want to be 'nice' to lower prio processes as well,
you can also do:

        if ((i & 127) == 127) {
                drop locks ...
                msleep(1);
                reacquire locks ...
        }

this will busy-loop for about 128 usecs at a time, and will sleep for
(at least) 1 msec after each such stretch.
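
To put rough numbers on this trade-off: the `(i & 127) == 127` test fires on every 128th iteration (i = 127, 255, ...), so a wait that runs all the way to the timeout sleeps once per 128 polls. The sketch below simply counts this. It assumes each iteration costs exactly 1 usec and each msleep(1) exactly 1 msec, which are idealizations (the real sleep may be longer); `struct poll_budget` and `worst_case_budget()` are names made up for this illustration.

```c
#include <assert.h>

/* Hypothetical bookkeeping for a wait that runs to the full timeout. */
struct poll_budget {
    long busy_usecs;   /* time spent in the 1-usec delays (idealized) */
    long sleeps;       /* number of msleep(1) calls */
    long total_usecs;  /* lower bound on elapsed wall-clock time */
};

static struct poll_budget worst_case_budget(long usec_timeout)
{
    struct poll_budget b = { 0, 0, 0 };

    assert(usec_timeout >= 0);
    for (long i = 0; i < usec_timeout; i++) {
        b.busy_usecs += 1;          /* one DRM_UDELAY(1) per iteration */
        if ((i & 127) == 127)
            b.sleeps += 1;          /* one msleep(1) per 128 iterations */
    }
    b.total_usecs = b.busy_usecs + b.sleeps * 1000;
    return b;
}
```

For example, with an assumed timeout of 100000 iterations this gives 781 sleeps, so the worst-case wait grows from about 0.1 sec to at least about 0.88 sec. The wait gets longer in the timeout case, but almost all of the extra time is spent sleeping, where other tasks can run.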

the cleanest and highest performing solution would be to have an
interrupt source for 3D completion - do you have such an interrupt
handler handy? If yes then you can sleep precisely up to completion:

        if ((i & 127) == 127) {
                drop locks ...
                wait_event(driver->wait_queue,
                        !(RADEON_READ(RADEON_RBBM_STATUS)
                          & RADEON_RBBM_ACTIVE));
                reacquire locks ...
                break;
        }

and in the IRQ handler, do a:

        wake_up(&driver->wait_queue);

the wakeup doesn't even have to be precise - i.e. you can wake up early
or for all events - wait_event()'s second argument, the 'is the wait
done' condition, takes care of it and re-suspends the task if the 3D
engine is still busy.

(to implement a timeout and signal-awareness as well you can use
wait_event_interruptible_timeout().)
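
When using wait_event_interruptible_timeout(), the caller has to tell three outcomes apart from the return value: a positive value means the condition became true, 0 means the timeout elapsed with the condition still false, and -ERESTARTSYS means a signal interrupted the wait. The following is a small userspace sketch of that decision only. `classify_wait_result()` and `enum wait_outcome` are hypothetical names, and `SKETCH_ERESTARTSYS` mirrors the kernel-internal ERESTARTSYS value because it is not available in userspace headers.

```c
#include <assert.h>

/* Mirrors the kernel-internal ERESTARTSYS (512); defined here only
 * so that the sketch compiles outside the kernel. */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical names for the three cases a caller must handle. */
enum wait_outcome {
    WAIT_DONE,        /* condition became true: engine is idle */
    WAIT_TIMED_OUT,   /* timeout elapsed, engine still busy */
    WAIT_SIGNALLED    /* a signal arrived: back out and let the syscall restart */
};

static enum wait_outcome classify_wait_result(long ret)
{
    if (ret > 0)
        return WAIT_DONE;
    if (ret == 0)
        return WAIT_TIMED_OUT;
    assert(ret == -SKETCH_ERESTARTSYS);   /* the only negative value expected */
    return WAIT_SIGNALLED;
}
```

In a driver the timed-out case would typically map to the same error the polling loop returns today, and the signalled case would propagate the negative value back to the caller after the locks have been retaken or released consistently.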

        Ingo

