Re: drm_sched run_job and scheduling latency

Chia-I Wu Thu, 05 Mar 2026 21:46:41 -0800

On Thu, Mar 5, 2026 at 3:10 PM Hillf Danton <[email protected]> wrote:
>
> On Wed, Mar 04, 2026 at 02:51:39PM -0800, Chia-I Wu wrote:
> > Hi,
> >
> > Our system compositor (surfaceflinger on android) submits gpu jobs
> > from a SCHED_FIFO thread to an RT gpu queue. However, because
> > workqueue threads are SCHED_NORMAL, the scheduling latency from submit
> > to run_job can sometimes cause frame misses. We are seeing this on
> > panthor and xe, but the issue should be common to all drm_sched users.
> >
> > Using a WQ_HIGHPRI workqueue helps, but it is still not RT (and won't
> > meet future android requirements). It seems either workqueue needs to
> > gain RT support, or drm_sched needs to support kthread_worker.
> >
> As RT means (in general) to some extent that the game of eevdf is played in
> __userspace__, but you are not PeterZ, so any issue like frame miss is
> understandably expected.
> Who made the workqueue worker a victim if the CPU cycles are not tight?
> Who is the new victim of a RT kthread worker?
> As RT is not free, what did you pay for it, given fewer RT success on market?
That is a deliberate decision for Android: avoiding frame misses is a
top priority.


Also, I think most drm drivers already signal their fences from irq
handlers or RT threads for a similar reason, and the same reasoning
applies to submissions as well.
