On Wed, Mar 11, 2015 at 02:53:39PM +0000, John Harrison wrote:
> On 05/03/2015 14:49, Daniel Vetter wrote:
> >On Thu, Mar 05, 2015 at 01:57:31PM +0000, john.c.harri...@intel.com wrote:
> >>From: John Harrison <john.c.harri...@intel.com>
> >>
> >>The LRC submission code requires a request for tracking purposes. It does 
> >>not
> >>actually require that request to 'complete' it simply uses it for keeping 
> >>hold
> >>of reference counts on contexts and such like.
> >>
> >>In the case where the ring buffer is completely full, the LRC code looks 
> >>for a
> >>pending request that would free up sufficient space upon completion and 
> >>waits
> >>for it. If no such request can be found it resorts to simply polling the 
> >>free
> >>space count until it is big enough. This situation should only occur when 
> >>the
> >>entire buffer is filled with the request currently being generated. I.e., 
> >>the
> >>user is trying to submit a single piece of work that is large than the ring
> >>buffer itself (which should be impossible because very large batch buffers 
> >>don't
> >>consume any more ring buffer space). Before starting to poll, a submit call 
> >>is
> >>made to make sure that the currently queued up work in the buffer will 
> >>actually
> >>be submtted and thus the poll will eventually succeed.
> >>
> >>The problem here is that the 'official' request cannot be used as that could
> >>lead to multiple LRC submissions being tagged to a single request structure.
> >>Instead, the code fakes up a private request structure and uses that.
> >>
> >>This patch moves the faked request allocation higher up in the call stack 
> >>to the
> >>wait code itself (rather than being at the very lowest submission level). 
> >>Thus
> >>it is now obvious where the faked request is coming from and why it is
> >>necessary. The patch also replaces it with a call to the official request
> >>allocation code rather than attempting to duplicate that code. This becomes
> >>especially important in the future when the request allocation changes to
> >>accommodate a conversion to struct fence.
> >>
> >>For: VIZ-5115
> >>Signed-off-by: John Harrison <john.c.harri...@intel.com>
> >This is only possible if you pile up tons of olr. Since your patch series
> >fixes this design issue by removing olr I think we can just put a WARN_ON
> >in here if this ever happens and bail out with -ELOSTMYMARBLES or
> >something. And then rip out all this complexity.
> >
> >Or do I miss something important?
> >-Daniel
> 
> Yeah, you missed the extremely important bug in the free space calculation
> that meant this impossible code path was being hit on a regular basis. The
> LRC wait_request code differed from the legacy wait_request code in the the
> latter was updated with request->postfix changes and the former was not.
> Thus the LRC one would happily find a request that frees up enough space,
> wait on it, retire it and then find there was still not enough space!
> 
> New patches to fix the space calculation bug and to completely remove the
> polling path will be forth coming...

Hm, Jesse did dig out a regression where gem_ringfill blows up on
execlist. That igt is specifically to exercise this cornercases. I'm not
sure whith bugzilla Jesse meant, there's two:

https://bugs.freedesktop.org/show_bug.cgi?id=89001
https://bugs.freedesktop.org/show_bug.cgi?id=88865

Can you please check whether your fixes help for those issues two?

Also adding Jesse since he's chasing these atm.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to