On Fri, 27 Oct 2023 16:23:24 +0200
Danilo Krummrich <d...@redhat.com> wrote:

> On 10/27/23 10:25, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich <d...@redhat.com> wrote:
> >   
> >> Currently, job flow control is implemented simply by limiting the number
> >> of jobs in flight. Therefore, a scheduler is initialized with a credit
> >> limit that corresponds to the number of jobs which can be sent to the
> >> hardware.
> >>
> >> This implies that for each job, drivers need to account for the maximum
> >> job size possible in order to not overflow the ring buffer.
> >>
> >> However, there are drivers, such as Nouveau, where the job size has a
> >> rather large range. For such drivers it can easily happen that job
> >> submissions not even filling the ring by 1% can block subsequent
> >> submissions, which, in the worst case, can lead to the ring run dry.
> >>
> >> In order to overcome this issue, allow for tracking the actual job size
> >> instead of the number of jobs. Therefore, add a field to track a job's
> >> credit count, which represents the number of credits a job contributes
> >> to the scheduler's credit limit.
> >>
> >> Signed-off-by: Danilo Krummrich <d...@redhat.com>
> >> ---
> >> Changes in V2:
> >> ==============
> >>    - fixed up influence on scheduling fairness due to consideration of a 
> >> job's
> >>      size
> >>      - If we reach a ready entity in drm_sched_select_entity() but can't 
> >> actually
> >>        queue a job from it due to size limitations, just give up and go to 
> >> sleep
> >>        until woken up due to a pending job finishing, rather than continue 
> >> to try
> >>        other entities.
> >>    - added a callback to dynamically update a job's credits (Boris)  
> > 
> > This callback seems controversial. I'd suggest dropping it, so the
> > patch can be merged.  
> 
> I don't think we should drop it just for the sake of moving forward. If there 
> are objections
> we'll discuss them. I've seen good reasons why the drivers you are working on 
> require this,
> while, following the discussion, I have *not* seen any reasons to drop it. It 
> helps simplifying
> driver code and doesn't introduce any complexity or overhead to existing 
> drivers.

Up to you. I'm just saying, moving one step in the right direction is
better than being stuck, and it's not like adding this callback in a
follow-up patch is super complicated either. If you're confident that
we can get all parties to agree on keeping this hook, fine by me.

Reply via email to