On Thu, May 03, 2018 at 09:58:56AM -0700, Rodrigo Vivi wrote:
> On Wed, May 02, 2018 at 03:31:15PM -0700, Tarun Vyas wrote:
> > On Wed, May 02, 2018 at 01:04:06PM -0700, Vivi, Rodrigo wrote:
> > > On Wed, May 02, 2018 at 09:51:43PM +0300, Ville Syrjälä wrote:
> > > > On Wed, May 02, 2018 at 11:19:14AM -0700, Tarun Vyas wrote:
> > > > > On Mon, Apr 30, 2018 at 10:19:33AM -0700, Rodrigo Vivi wrote:
> > > > > > On Sun, Apr 29, 2018 at 09:00:18PM -0700, Tarun Vyas wrote:
> > > > > > > From: Tarun <tarun.v...@intel.com>
> > > > > > >
> > > > > > > The PIPEDSL freezes on PSR entry, and if PSR hasn't fully
> > > > > > > exited, then the pipe_update_start call schedules itself out
> > > > > > > to check back later.
> > > > > > >
> > > > > > > On the ChromeOS-4.4 kernel, which is fairly up-to-date w.r.t
> > > > > > > drm/i915 but lags w.r.t core kernel code, hot plugging an
> > > > > > > external display triggers tons of "potential atomic update
> > > > > > > errors" in the dmesg, on *pipe A*. A closer analysis reveals
> > > > > > > that we try to read the scanline 3 times and eventually time
> > > > > > > out, b/c PSR hasn't exited fully, leading to a PIPEDSL stuck
> > > > > > > @ 1599. This issue is not seen on upstream kernels, b/c for
> > > > > > > *some* reason we loop inside intel_pipe_update_start for
> > > > > > > ~2+ msec, which in this case is more than enough to exit PSR
> > > > > > > fully, hence an *unstuck* PIPEDSL counter, hence no error. On
> > > > > > > the other hand, the ChromeOS kernel spends ~1.1 msec looping
> > > > > > > inside intel_pipe_update_start and hence errors out b/c the
> > > > > > > source is still in PSR.
> > > > > > >
> > > > > > > Regardless, we should wait for PSR exit (if PSR is supported
> > > > > > > and active on the current pipe) before reading the PIPEDSL,
> > > > > > > b/c if we haven't fully exited PSR, then checking for vblank
> > > > > > > evasion isn't actually applicable.
> > > > > > >
> > > > > > > This scenario applies to a configuration with an additional
> > > > > > > pipe, as of now.
> > > > > >
> > > > > > I honestly believe you're picking the wrong culprit here, by
> > > > > > "coincidence". PSR will allow DC states with the screen on, and
> > > > > > a DC state will mess up all register reads....
> > > > > >
> > > > > > Probably what you are missing in your kernel is some power
> > > > > > domain grab that would keep DC_OFF and consequently give a sane
> > > > > > read of these registers.
> > > > > >
> > > > > > Maybe Imre has a quick idea of what you could be missing in your
> > > > > > kernel that we already have in the upstream one.
> > > > > >
> > > > > > Thanks,
> > > > > > Rodrigo.
> > > > > >
> > > > > Thanks for the quick response, Rodrigo!
> > > > > Some key observations based on my experiments so far, with the
> > > > > wait loop from intel_pipe_update_start quoted for reference:
> > > > >        for (;;) {
> > > > >                 /*
> > > > >                  * prepare_to_wait() has a memory barrier, which
> > > > >                  * guarantees other CPUs can see the task state
> > > > >                  * update by the time we read the scanline.
> > > > >                  */
> > > > >                 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
> > > > >
> > > > >                 scanline = intel_get_crtc_scanline(crtc);
> > > > >                 if (scanline < min || scanline > max)
> > > > >                         break;
> > > > >
> > > > >                 if (timeout <= 0) {
> > > > >                         DRM_ERROR("Potential atomic update failure on pipe %c\n",
> > > > >                                   pipe_name(crtc->pipe));
> > > > >                         break;
> > > > >                 }
> > > > >
> > > > >                 local_irq_enable();
> > > > >
> > > > >                 timeout = schedule_timeout(timeout);
> > > > >
> > > > >                 local_irq_disable();
> > > > >         }
> > > > > 1. In the above loop inside pipe_update_start, the *first time* we
> > > > > read the PIPEDSL with PSR1 and an external display connected, it
> > > > > always reads 1599, for *both* kernels (upstream and ChromeOS-4.4).
> > > > > The PSR_STATUS also reads exactly the same for *both* kernels and
> > > > > shows that we haven't *fully* exited PSR.
> > > > >
> > > > > 2. The difference between the two kernels comes after this first
> > > > > read of the PIPEDSL. ChromeOS-4.4 spends ~1 msec inside that loop
> > > > > and upstream spends ~2 msec. I suspect this is because of the
> > > > > scheduling changes between the two kernels, b/c I can't find any
> > > > > i915-specific code running in that loop, except for vblank
> > > > > processing.
> > > > >
> > > > > 3. So to summarize, both kernels are in the same state w.r.t PSR
> > > > > and the PIPEDSL value when they read the PIPEDSL for the first
> > > > > time inside the loop. *When* the kernels *transition* to a *full
> > > > > PSR exit* is what differs.
> > > 
> > > Oh! So you really are getting reliable counters....
> > > 
> > > > >
> > > > > My rationale for this patch is that the pipe_update_start function
> > > > > is meant to evade the 100 usec before a vblank, but *if* we haven't
> > > > > *fully* exited PSR (which is true for both kernels on the first
> > > > > PIPEDSL read), then vblank evasion is *not applicable* b/c the
> > > > > PIPEDSL will be messed up. So we shouldn't bother evading vblank
> > > > > until we have fully exited PSR.
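> > > > >
> > > > > To make the idea concrete, here's a rough sketch of the check I
> > > > > have in mind before the evasion loop (illustrative only: the
> > > > > has_psr flag, the 10 msec timeout and the HSW+ register names are
> > > > > placeholders, and the real patch may end up looking different):
> > > > >
> > > > >         /* PIPEDSL is frozen while the source is in PSR, so the
> > > > >          * scanline-based vblank evasion below is meaningless
> > > > >          * until the hardware reports a full PSR exit (IDLE).
> > > > >          */
> > > > >         if (new_crtc_state->has_psr &&
> > > > >             intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL,
> > > > >                                     EDP_PSR_STATUS_STATE_MASK,
> > > > >                                     EDP_PSR_STATUS_STATE_IDLE,
> > > > >                                     10))
> > > > >                 DRM_ERROR("PSR idle timed out, vblank evasion may fail\n");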
> > > >
> > > > Yeah, I think this is the right direction. The problem really is the
> > > > extra vblank pulse that the hardware generates (or at least can
> > > > generate depending on a chicken bit) when it exits PSR. We have no
> > > > control over when that happens and hence we have no control over when
> > > > the registers get latched. And yet we still have to somehow prevent
> > > > the register latching from occurring while we're in the middle of
> > > > reprogramming them.
> > > 
> > > I see the problem now. Thanks for the explanation.
> > > 
> > > >
> > > > There are a couple of ways to avoid this:
> > > > 1) Set the chicken bit so that we don't get the vblank pulse. The
> > > >    pipe should restart from the vblank start, so we would have one
> > > >    full frame to reprogram the registers. However IIRC DK told me
> > > >    there is no way to fully eliminate it in all cases, so this
> > > >    option is probably out. There was also some implication for FBC
> > > >    which I already forgot.
> > > > 2) Make sure we've exited PSR before reprogramming the registers
> > > >    (ie. what you do).
> > > > 3) Use the DOUBLE_BUFFER_CTL to prevent the extra vblank pulse from
> > > >    latching the registers while we're still reprogramming them.
> > > >    This feature only exists on SKL+ so is not a solution for
> > > >    HSW/BDW. But maybe HSW/BDW didn't even have the extra vblank
> > > >    pulse?
> > > >
> > > > Option 2) does provide consistent behaviour on all platforms, so I
> > > > do kinda like it. It also avoids a bigger rework on account of the
> > > > DOUBLE_BUFFER_CTL. I do think we'll have to start using
> > > > DOUBLE_BUFFER_CTL anyway due to other issues, but at least this way
> > > > we don't block PSR progress on that work.
> > > 
> > > My vote is for option 2. It seems more straightforward and more
> > > broadly applicable.
> > > 
> > > DK?
> > > 
> > > My only request on the patch itself would be to create a function
> > > in intel_psr.c, intel_psr_wait_for_idle() or something like that,
> > > and put the register wait logic inside it instead of spreading
> > > the PSR code around.
> > > 
> > > Thanks,
> > > Rodrigo.
> > > 
> > > >
> > > > --
> > > > Ville Syrjälä
> > > > Intel
> > Thanks for the comments, Ville and Rodrigo. I'll rework this to move the
> > wait to intel_psr.c. There is a psr_wait_for_idle() in there, but there are
> > some PSR locks being passed around inside it (eventually released by the
> > caller). Also, the max timeout specified there is 50 msec, which might be
> > way too much?
> 
> Ouch! That function is ugly.... unlock then lock back again...
> (especially unlocking without any assert that the lock is held... :/)
> 
> If you can improve that or split it in a way that lets us reuse some code,
> that would be nice...
>
Yes, I was also considering splitting it up, roughly along the lines of the 
sketch below. I'll rework and send it out.
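
Roughly, the split I have in mind (just a sketch: the helper name, the
10 msec timeout and the HSW+-only register usage are placeholders, and the
VLV/CHV path is left out):

        /* Lockless core wait, shared by the existing psr_wait_for_idle()
         * callers and the new vblank evasion path; no PSR lock juggling.
         */
        static int __psr_wait_for_idle(struct drm_i915_private *dev_priv,
                                       unsigned int timeout_ms)
        {
                return intel_wait_for_register(dev_priv, EDP_PSR_STATUS_CTL,
                                               EDP_PSR_STATUS_STATE_MASK,
                                               EDP_PSR_STATUS_STATE_IDLE,
                                               timeout_ms);
        }

        /* Called from intel_pipe_update_start(); the caller has already
         * checked that PSR is in use on this pipe. Uses a much shorter
         * timeout than the current 50 msec, since we only need to cover
         * the PSR exit latency here.
         */
        int intel_psr_wait_for_idle(struct drm_i915_private *dev_priv)
        {
                return __psr_wait_for_idle(dev_priv, 10);
        }

That way pipe_update_start never has to touch the PSR locks at all.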
Thanks!

-Tarun 
> >
> > Best,
> > Tarun