Re: [Intel-gfx] [PATCH] drm/i915: Only report a wakeup if the waiter was truly asleep
On 11/12/2017 17:08, Chris Wilson wrote: Quoting Tvrtko Ursulin (2017-12-11 16:10:49) On 09/12/2017 12:47, Chris Wilson wrote: If we attempt to wake up a waiter, who is currently checking the seqno it will be in the TASK_INTERRUPTIBLE state and ttwu will report success. However, it is actually awake and functioning -- so delay reporting the actual wake up until it sleeps. This fixes some spurious claims of missed_breadcrumbs when running under heavy load; i.e. sufficient load to preempt away the newly woken waiter before they complete their checks. However, it does so at the cost of a rare false negative; where the waiter changes between the check and ttwu -- the only way to fix that would be to extend the reporting from ttwu where the check could be done atomically. v2: Defend against !CONFIG_SMP v3: Don't filter out calls to wake_up_process Testcase: igt/drv_missed_irq # sanity check we do detect missed_breadcrumb() Testcase: igt/gem_concurrent_blit # for generating false positives References: https://bugs.freedesktop.org/show_bug.cgi?id=17 Signed-off-by: Chris WilsonCc: Tvrtko Ursulin Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_breadcrumbs.c | 39 1 file changed, 30 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c index 24c6fefdd0b1..76e6f8e7cfd4 100644 --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c @@ -27,6 +27,12 @@ #include "i915_drv.h" +#ifdef CONFIG_SMP +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_cpu) +#else +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL) +#endif + I kind of remember the on_cpu from before and I was probably complaining about it. Sigh, if it helps ok.. static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) { struct intel_wait *wait; @@ -36,8 +42,20 @@ static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) wait = b->irq_wait; if (wait) { + /* + * N.B. Since task_asleep() and ttwu are not atomic, the + * waiter may actually go to sleep after the check, causing + * us to suppress a valid wakeup. We prefer to reduce the + * number of false positive missed_breadcrumb() warnings + * at the expense of a few false negatives, as it it easy + * to trigger a false positive under heavy load. Enough + * signal should remain from genuine missed_breadcrumb() + * for us to detect in CI. + */ + bool was_asleep = task_asleep(wait->tsk); + result = ENGINE_WAKEUP_WAITER; - if (wake_up_process(wait->tsk)) + if (wake_up_process(wait->tsk) && was_asleep) result |= ENGINE_WAKEUP_ASLEEP; } @@ -47,12 +65,15 @@ static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) unsigned int intel_engine_wakeup(struct intel_engine_cs *engine) { struct intel_breadcrumbs *b = >breadcrumbs; - unsigned long flags; - unsigned int result; + unsigned int result = 0; - spin_lock_irqsave(>irq_lock, flags); - result = __intel_breadcrumbs_wakeup(b); - spin_unlock_irqrestore(>irq_lock, flags); + if (READ_ONCE(b->irq_wait)) { + unsigned long flags; + + spin_lock_irqsave(>irq_lock, flags); + result = __intel_breadcrumbs_wakeup(b); + spin_unlock_irqrestore(>irq_lock, flags); + } This hunk I'd leave out from the fix. And if I postpone that hunk to tomorrow, would r-b the rest? Yep. Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only report a wakeup if the waiter was truly asleep
Quoting Tvrtko Ursulin (2017-12-11 16:10:49) > > On 09/12/2017 12:47, Chris Wilson wrote: > > If we attempt to wake up a waiter, who is currently checking the seqno > > it will be in the TASK_INTERRUPTIBLE state and ttwu will report success. > > However, it is actually awake and functioning -- so delay reporting the > > actual wake up until it sleeps. This fixes some spurious claims of > > missed_breadcrumbs when running under heavy load; i.e. sufficient load to > > preempt away the newly woken waiter before they complete their checks. > > However, it does so at the cost of a rare false negative; where the > > waiter changes between the check and ttwu -- the only way to fix that > > would be to extend the reporting from ttwu where the check could be done > > atomically. > > > > v2: Defend against !CONFIG_SMP > > v3: Don't filter out calls to wake_up_process > > > > Testcase: igt/drv_missed_irq # sanity check we do detect missed_breadcrumb() > > Testcase: igt/gem_concurrent_blit # for generating false positives > > References: https://bugs.freedesktop.org/show_bug.cgi?id=17 > > Signed-off-by: Chris Wilson> > Cc: Tvrtko Ursulin > > Cc: Joonas Lahtinen > > --- > > drivers/gpu/drm/i915/intel_breadcrumbs.c | 39 > > > > 1 file changed, 30 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c > > b/drivers/gpu/drm/i915/intel_breadcrumbs.c > > index 24c6fefdd0b1..76e6f8e7cfd4 100644 > > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c > > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c > > @@ -27,6 +27,12 @@ > > > > #include "i915_drv.h" > > > > +#ifdef CONFIG_SMP > > +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_cpu) > > +#else > > +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL) > > +#endif > > + > > I kind of remember the on_cpu from before and I was probably complaining > about it. Sigh, if it helps ok.. > > > static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs > > *b) > > { > > struct intel_wait *wait; > > @@ -36,8 +42,20 @@ static unsigned int __intel_breadcrumbs_wakeup(struct > > intel_breadcrumbs *b) > > > > wait = b->irq_wait; > > if (wait) { > > + /* > > + * N.B. Since task_asleep() and ttwu are not atomic, the > > + * waiter may actually go to sleep after the check, causing > > + * us to suppress a valid wakeup. We prefer to reduce the > > + * number of false positive missed_breadcrumb() warnings > > + * at the expense of a few false negatives, as it it easy > > + * to trigger a false positive under heavy load. Enough > > + * signal should remain from genuine missed_breadcrumb() > > + * for us to detect in CI. > > + */ > > + bool was_asleep = task_asleep(wait->tsk); > > + > > result = ENGINE_WAKEUP_WAITER; > > - if (wake_up_process(wait->tsk)) > > + if (wake_up_process(wait->tsk) && was_asleep) > > result |= ENGINE_WAKEUP_ASLEEP; > > } > > > > @@ -47,12 +65,15 @@ static unsigned int __intel_breadcrumbs_wakeup(struct > > intel_breadcrumbs *b) > > unsigned int intel_engine_wakeup(struct intel_engine_cs *engine) > > { > > struct intel_breadcrumbs *b = >breadcrumbs; > > - unsigned long flags; > > - unsigned int result; > > + unsigned int result = 0; > > > > - spin_lock_irqsave(>irq_lock, flags); > > - result = __intel_breadcrumbs_wakeup(b); > > - spin_unlock_irqrestore(>irq_lock, flags); > > + if (READ_ONCE(b->irq_wait)) { > > + unsigned long flags; > > + > > + spin_lock_irqsave(>irq_lock, flags); > > + result = __intel_breadcrumbs_wakeup(b); > > + spin_unlock_irqrestore(>irq_lock, flags); > > + } > > This hunk I'd leave out from the fix. And if I postpone that hunk to tomorrow, would r-b the rest? -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only report a wakeup if the waiter was truly asleep
On 09/12/2017 12:47, Chris Wilson wrote: If we attempt to wake up a waiter, who is currently checking the seqno it will be in the TASK_INTERRUPTIBLE state and ttwu will report success. However, it is actually awake and functioning -- so delay reporting the actual wake up until it sleeps. This fixes some spurious claims of missed_breadcrumbs when running under heavy load; i.e. sufficient load to preempt away the newly woken waiter before they complete their checks. However, it does so at the cost of a rare false negative; where the waiter changes between the check and ttwu -- the only way to fix that would be to extend the reporting from ttwu where the check could be done atomically. v2: Defend against !CONFIG_SMP v3: Don't filter out calls to wake_up_process Testcase: igt/drv_missed_irq # sanity check we do detect missed_breadcrumb() Testcase: igt/gem_concurrent_blit # for generating false positives References: https://bugs.freedesktop.org/show_bug.cgi?id=17 Signed-off-by: Chris WilsonCc: Tvrtko Ursulin Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_breadcrumbs.c | 39 1 file changed, 30 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c index 24c6fefdd0b1..76e6f8e7cfd4 100644 --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c @@ -27,6 +27,12 @@ #include "i915_drv.h" +#ifdef CONFIG_SMP +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_cpu) +#else +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL) +#endif + I kind of remember the on_cpu from before and I was probably complaining about it. Sigh, if it helps ok.. static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) { struct intel_wait *wait; @@ -36,8 +42,20 @@ static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) wait = b->irq_wait; if (wait) { + /* +* N.B. Since task_asleep() and ttwu are not atomic, the +* waiter may actually go to sleep after the check, causing +* us to suppress a valid wakeup. We prefer to reduce the +* number of false positive missed_breadcrumb() warnings +* at the expense of a few false negatives, as it it easy +* to trigger a false positive under heavy load. Enough +* signal should remain from genuine missed_breadcrumb() +* for us to detect in CI. +*/ + bool was_asleep = task_asleep(wait->tsk); + result = ENGINE_WAKEUP_WAITER; - if (wake_up_process(wait->tsk)) + if (wake_up_process(wait->tsk) && was_asleep) result |= ENGINE_WAKEUP_ASLEEP; } @@ -47,12 +65,15 @@ static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) unsigned int intel_engine_wakeup(struct intel_engine_cs *engine) { struct intel_breadcrumbs *b = >breadcrumbs; - unsigned long flags; - unsigned int result; + unsigned int result = 0; - spin_lock_irqsave(>irq_lock, flags); - result = __intel_breadcrumbs_wakeup(b); - spin_unlock_irqrestore(>irq_lock, flags); + if (READ_ONCE(b->irq_wait)) { + unsigned long flags; + + spin_lock_irqsave(>irq_lock, flags); + result = __intel_breadcrumbs_wakeup(b); + spin_unlock_irqrestore(>irq_lock, flags); + } This hunk I'd leave out from the fix. return result; } @@ -77,8 +98,8 @@ static noinline void missed_breadcrumb(struct intel_engine_cs *engine) static void intel_breadcrumbs_hangcheck(struct timer_list *t) { - struct intel_engine_cs *engine = from_timer(engine, t, - breadcrumbs.hangcheck); + struct intel_engine_cs *engine = + from_timer(engine, t, breadcrumbs.hangcheck); struct intel_breadcrumbs *b = >breadcrumbs; if (!b->irq_armed) @@ -104,7 +125,7 @@ static void intel_breadcrumbs_hangcheck(struct timer_list *t) */ if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP) { missed_breadcrumb(engine); - mod_timer(>breadcrumbs.fake_irq, jiffies + 1); + mod_timer(>fake_irq, jiffies + 1); } else { mod_timer(>hangcheck, wait_timeout()); } I'll turn a blind eye to this one. :) Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only report a wakeup if the waiter was truly asleep
Hi Chris, [auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on next-20170405] [cannot apply to v4.11-rc5] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Only-report-a-wakeup-if-the-waiter-was-truly-asleep/20170405-165353 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: i386-randconfig-sb0-04050506 (attached as .config) compiler: gcc-5 (Debian 5.4.1-2) 5.4.1 20160904 reproduce: # save the attached .config to linux build tree make ARCH=i386 All errors (new ones prefixed by >>): drivers/gpu/drm/i915/intel_breadcrumbs.c: In function '__wake_up_sleeper': >> drivers/gpu/drm/i915/intel_breadcrumbs.c:38:13: error: 'struct task_struct' >> has no member named 'on_cpu' return !tsk->on_cpu && wake_up_process(tsk); ^ drivers/gpu/drm/i915/intel_breadcrumbs.c:39:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ vim +38 drivers/gpu/drm/i915/intel_breadcrumbs.c 32 /* Be careful not to report a successful wakeup if the waiter is 33 * currently processing the seqno, where it will have already 34 * called set_task_state(TASK_INTERRUPTIBLE). We first check whether 35 * the task is currently asleep before calling ttwu, and then we 36 * only report success if we were the ones to then trigger the wakeup. 37 */ > 38 return !tsk->on_cpu && wake_up_process(tsk); 39 } 40 41 static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Only report a wakeup if the waiter was truly asleep
Hi Chris, [auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on next-20170405] [cannot apply to v4.11-rc5] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Only-report-a-wakeup-if-the-waiter-was-truly-asleep/20170405-165353 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: x86_64-randconfig-x004-201714 (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All error/warnings (new ones prefixed by >>): drivers/gpu//drm/i915/intel_breadcrumbs.c: In function '__wake_up_sleeper': >> drivers/gpu//drm/i915/intel_breadcrumbs.c:38:13: error: 'struct task_struct' >> has no member named 'on_cpu'; did you mean 'on_rq'? return !tsk->on_cpu && wake_up_process(tsk); ^~ >> drivers/gpu//drm/i915/intel_breadcrumbs.c:39:1: warning: control reaches end >> of non-void function [-Wreturn-type] } ^ vim +38 drivers/gpu//drm/i915/intel_breadcrumbs.c 32 /* Be careful not to report a successful wakeup if the waiter is 33 * currently processing the seqno, where it will have already 34 * called set_task_state(TASK_INTERRUPTIBLE). We first check whether 35 * the task is currently asleep before calling ttwu, and then we 36 * only report success if we were the ones to then trigger the wakeup. 37 */ > 38 return !tsk->on_cpu && wake_up_process(tsk); > 39 } 40 41 static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b) 42 { --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx