Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-31 Thread Daniel Vetter
On Wed, Aug 30, 2017 at 04:13:41PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-08-30 14:59:32)
> > On Wed, Aug 30, 2017 at 01:56:40PM +0100, Chris Wilson wrote:
> > > Quoting Daniel Vetter (2017-08-30 13:23:56)
> > > > Or just the need to add a pile more tests to pm_rpm?
> > > 
> > > No. It's just your regular combinatorial explosion. The approach I would
> > > take here would be to register a sysenter callback that attempted to do a
> > > rpm suspend (i.e. so ~every ioctl would start from idle, and controlled
> > > via the faultinjection framework) and then run the minimal test set that
> > > exercises all ioctl paths, and then expand to all driver branches.
> > > 
> > > First we need coverage feedback.
> > 
> > What I meant to imply: As long as any display is on we will never rpm
> > suspend. Mostly this is the case for CI machines.
> > 
> > The new testcases I've had in mind would explicitly dpms off the display
> > before running a set of gem testcases. We don't want to do that everywhere
> > though, because a dpms on/off is very costly.
> 
> If no userspace is using the display and we remove fbcon, shouldn't the
> kernel be disabling the outputs anyway? There's literally nothing there
> to provide display continuity.

Fastboot will take over boot-up state and keep it running until a kms
client takes over. Once that happened, we'll have dropped the bios fb on
the floor and will shut down all outputs on close (due to removal of the
userspace fbs).

Which means if you want to rely on that, then how your test behaves
depends upon when we last had to reboot, and whether a kms test has run
before you. That's really bad from a "too much noise in CI" pov (because
with the full sharded run it's somewhat randomized each time).

Also, we have fbcon tests in igt, so disabling fbcon isn't an option
really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-30 Thread Chris Wilson
Quoting Daniel Vetter (2017-08-30 14:59:32)
> On Wed, Aug 30, 2017 at 01:56:40PM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2017-08-30 13:23:56)
> > > Or just the need to add a pile more tests to pm_rpm?
> > 
> > No. It's just your regular combinatorial explosion. The approach I would
> > take here would be to register a sysenter callback that attempted to do a
> > rpm suspend (i.e. so ~every ioctl would start from idle, and controlled
> > via the faultinjection framework) and then run the minimal test set that
> > exercises all ioctl paths, and then expand to all driver branches.
> > 
> > First we need coverage feedback.
> 
> What I meant to imply: As long as any display is on we will never rpm
> suspend. Mostly this is the case for CI machines.
> 
> The new testcases I've had in mind would explicitly dpms off the display
> before running a set of gem testcases. We don't want to do that everywhere
> though, because a dpms on/off is very costly.

If no userspace is using the display and we remove fbcon, shouldn't the
kernel be disabling the outputs anyway? There's literally nothing there
to provide display continuity.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-30 Thread Daniel Vetter
On Wed, Aug 30, 2017 at 01:56:40PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-08-30 13:23:56)
> > Or just the need to add a pile more tests to pm_rpm?
> 
> No. It's just your regular combinatorial explosion. The approach I would
> take here would be to register a sysenter callback that attempted to do a
> rpm suspend (i.e. so ~every ioctl would start from idle, and controlled
> via the faultinjection framework) and then run the minimal test set that
> exercises all ioctl paths, and then expand to all driver branches.
> 
> First we need coverage feedback.

What I meant to imply: As long as any display is on we will never rpm
suspend. Mostly this is the case for CI machines.

The new testcases I've had in mind would explicitly dpms off the display
before running a set of gem testcases. We don't want to do that everywhere
though, because a dpms on/off is very costly.

I guess once we switch (eventually, hopefully) to a binary-at-time model
for full igt CI we could amortize that over a pile of substests and do it
almost everywhere.

So I think adding a force rpm suspend won't help, because under normal CI
runs we can't ever get there becaus of the active display. And that's why
we're not hitting all these tons of problems anywhere else.

This might also be good reason for more server chips, or at least fake
server chips, where we disable the display entirely.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-30 Thread Chris Wilson
Quoting Chris Wilson (2017-08-30 13:56:40)
> No. It's just your regular combinatorial explosion. The approach I would
> take here would be to register a sysenter callback that attempted to do a
> rpm suspend (i.e. so ~every ioctl would start from idle, and controlled
> via the faultinjection framework) and then run the minimal test set that
> exercises all ioctl paths, and then expand to all driver branches.

The simpler option is to apply the faultinjection to the rpm_put.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-30 Thread Chris Wilson
Quoting Chris Wilson (2017-08-30 13:56:40)
> Quoting Daniel Vetter (2017-08-30 13:23:56)
> > Or just the need to add a pile more tests to pm_rpm?
> 
> No. It's just your regular combinatorial explosion. The approach I would
> take here would be to register a sysenter callback that attempted to do a
> rpm suspend (i.e. so ~every ioctl would start from idle, and controlled
> via the faultinjection framework) and then run the minimal test set that
> exercises all ioctl paths, and then expand to all driver branches.

We also need the corollary that without rpm it works.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-30 Thread Chris Wilson
Quoting Daniel Vetter (2017-08-30 13:23:56)
> On Tue, Aug 29, 2017 at 03:59:36PM +0100, Chris Wilson wrote:
> > Quoting Joonas Lahtinen (2017-08-29 15:54:06)
> > > On Tue, 2017-08-29 at 11:33 +0100, Chris Wilson wrote:
> > > > Since we hold the device wakeref when writing through the GTT (otherwise
> > > > the writes would fail), we presumed that before the device sleeps those
> > > > writes would naturally be flushed and that we wouldn't need our mmio
> > > > read trick. However, that presumption seems false and a sleepy bxt seems
> > > > to require us to always manually flush the GTT writes prior to direct
> > > > access.
> > > > 
> > > > Fixes: e2a2aa36a509 ("drm/i915: Check we have an wake device before 
> > > > flushing GTT writes")
> > > > Signed-off-by: Chris Wilson 
> > > > Cc: Joonas Lahtinen 
> > > 
> > > Got any Bugzilla, Testcase, Tested-by?
> > 
> > Original bugzilla hasn't been reopened, so I its looks like they were
> > happy enough with the original patches that fixed the problem on my bxt.
> > The testcase seems to be very system dependent, my suspicion is that it
> > has to do with the wacky runtime pm exhibited by CI bxt.
> 
> CI bxt doesn't have displays, which means we shut down a lot more when
> it's running. Does this indicate a huge gem test gap where we should run
> plenty of gem testcases with all the outputs shut down?

This one is hard to tell since we are guessing at how the hw actually
works. Strong PCI ordering it is not.

If we take this example at face value, the key point of failure was
rpm_get_if_in_use, so we could simply say that we need to ensure that
all such branches are exercised, with varying amounts of stress since we
are looking for a random hw delay.

At the moment that boils down to the shrinker being avoiding unbinding
anything whilst the device is idle, pushing us closer to oom (with
kswapd hopefully riding to the rescue, and we all know how unreliable
kswapd is).

The wacky part of CI suspend seems to be that there's no reason for the
device to wake at times, quite a few of the tests are just burning
cycles without touching the hw and we still have the constant stream of
suspend/resume. I'm very suspicious that we are waking up too often (and
that it takes too long, about 28ms including the hpd of a headless
machine).

> Or just the need to add a pile more tests to pm_rpm?

No. It's just your regular combinatorial explosion. The approach I would
take here would be to register a sysenter callback that attempted to do a
rpm suspend (i.e. so ~every ioctl would start from idle, and controlled
via the faultinjection framework) and then run the minimal test set that
exercises all ioctl paths, and then expand to all driver branches.

First we need coverage feedback.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-30 Thread Daniel Vetter
On Tue, Aug 29, 2017 at 03:59:36PM +0100, Chris Wilson wrote:
> Quoting Joonas Lahtinen (2017-08-29 15:54:06)
> > On Tue, 2017-08-29 at 11:33 +0100, Chris Wilson wrote:
> > > Since we hold the device wakeref when writing through the GTT (otherwise
> > > the writes would fail), we presumed that before the device sleeps those
> > > writes would naturally be flushed and that we wouldn't need our mmio
> > > read trick. However, that presumption seems false and a sleepy bxt seems
> > > to require us to always manually flush the GTT writes prior to direct
> > > access.
> > > 
> > > Fixes: e2a2aa36a509 ("drm/i915: Check we have an wake device before 
> > > flushing GTT writes")
> > > Signed-off-by: Chris Wilson 
> > > Cc: Joonas Lahtinen 
> > 
> > Got any Bugzilla, Testcase, Tested-by?
> 
> Original bugzilla hasn't been reopened, so I its looks like they were
> happy enough with the original patches that fixed the problem on my bxt.
> The testcase seems to be very system dependent, my suspicion is that it
> has to do with the wacky runtime pm exhibited by CI bxt.

CI bxt doesn't have displays, which means we shut down a lot more when
it's running. Does this indicate a huge gem test gap where we should run
plenty of gem testcases with all the outputs shut down?

Or just the need to add a pile more tests to pm_rpm?

Would be good if testcase review is a part of review, and not just "code
does what the commit message says" ... The latter should be the
trivial-most part of review really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-29 Thread Chris Wilson
Quoting Joonas Lahtinen (2017-08-29 15:54:06)
> On Tue, 2017-08-29 at 11:33 +0100, Chris Wilson wrote:
> > Since we hold the device wakeref when writing through the GTT (otherwise
> > the writes would fail), we presumed that before the device sleeps those
> > writes would naturally be flushed and that we wouldn't need our mmio
> > read trick. However, that presumption seems false and a sleepy bxt seems
> > to require us to always manually flush the GTT writes prior to direct
> > access.
> > 
> > Fixes: e2a2aa36a509 ("drm/i915: Check we have an wake device before 
> > flushing GTT writes")
> > Signed-off-by: Chris Wilson 
> > Cc: Joonas Lahtinen 
> 
> Got any Bugzilla, Testcase, Tested-by?

Original bugzilla hasn't been reopened, so I its looks like they were
happy enough with the original patches that fixed the problem on my bxt.
The testcase seems to be very system dependent, my suspicion is that it
has to do with the wacky runtime pm exhibited by CI bxt.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-29 Thread Joonas Lahtinen
On Tue, 2017-08-29 at 11:33 +0100, Chris Wilson wrote:
> Since we hold the device wakeref when writing through the GTT (otherwise
> the writes would fail), we presumed that before the device sleeps those
> writes would naturally be flushed and that we wouldn't need our mmio
> read trick. However, that presumption seems false and a sleepy bxt seems
> to require us to always manually flush the GTT writes prior to direct
> access.
> 
> Fixes: e2a2aa36a509 ("drm/i915: Check we have an wake device before flushing 
> GTT writes")
> Signed-off-by: Chris Wilson 
> Cc: Joonas Lahtinen 

Got any Bugzilla, Testcase, Tested-by?

Does what the message describes.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-29 Thread Chris Wilson
Since we hold the device wakeref when writing through the GTT (otherwise
the writes would fail), we presumed that before the device sleeps those
writes would naturally be flushed and that we wouldn't need our mmio
read trick. However, that presumption seems false and a sleepy bxt seems
to require us to always manually flush the GTT writes prior to direct
access.

Fixes: e2a2aa36a509 ("drm/i915: Check we have an wake device before flushing 
GTT writes")
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 43834dee4e8d..890fe2802973 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -695,12 +695,11 @@ flush_write_domain(struct drm_i915_gem_object *obj, 
unsigned int flush_domains)
switch (obj->base.write_domain) {
case I915_GEM_DOMAIN_GTT:
if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
-   if (intel_runtime_pm_get_if_in_use(dev_priv)) {
-   spin_lock_irq(&dev_priv->uncore.lock);
-   
POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
-   spin_unlock_irq(&dev_priv->uncore.lock);
-   intel_runtime_pm_put(dev_priv);
-   }
+   intel_runtime_pm_get(dev_priv);
+   spin_lock_irq(&dev_priv->uncore.lock);
+   
POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
+   spin_unlock_irq(&dev_priv->uncore.lock);
+   intel_runtime_pm_put(dev_priv);
}
 
intel_fb_obj_flush(obj,
-- 
2.14.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Always wake the device to flush the GTT

2017-08-28 Thread Chris Wilson
Experimental patch for bxt/gem_pwrite.
---
 drivers/gpu/drm/i915/i915_gem.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 37fbc64d9ffe..4f6af401320c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -695,12 +695,11 @@ flush_write_domain(struct drm_i915_gem_object *obj, 
unsigned int flush_domains)
switch (obj->base.write_domain) {
case I915_GEM_DOMAIN_GTT:
if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
-   if (intel_runtime_pm_get_if_in_use(dev_priv)) {
-   spin_lock_irq(&dev_priv->uncore.lock);
-   
POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
-   spin_unlock_irq(&dev_priv->uncore.lock);
-   intel_runtime_pm_put(dev_priv);
-   }
+   intel_runtime_pm_get(dev_priv);
+   spin_lock_irq(&dev_priv->uncore.lock);
+   
POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
+   spin_unlock_irq(&dev_priv->uncore.lock);
+   intel_runtime_pm_put(dev_priv);
}
 
intel_fb_obj_flush(obj,
-- 
2.14.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx