Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 2019-09-26 16:44:33 [+0100], Chris Wilson wrote:
> > It's all edge interrupts -- although for gen2/3 my memory is hazy. But
> > the GPU (circa gen6) can generate more than enough interrupts to
> > saturate a CPU.
>
> So everything older than gen5 has MSI disabled it appears and needs
> ONESHOT.

Also ACPI/PCI-quirks may decide that MSI is broken on the system and
disable it.
If you end up with a shared handler, you can't mix ONESHOT among the
handlers: either all of them have that flag set or none of them. In that
case you need to provide a tiny primary handler which just disables the
IRQ (in the HW) and the threaded handler has to enable it again (at the
end of its routine).

> -Chris

Sebastian
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
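The manual-ONESHOT pattern Sebastian describes maps onto request_threaded_irq() roughly as below. This is a hypothetical sketch, not i915 code: the doodad_* struct and helpers are invented for illustration, and the actual mask/unmask mechanism depends on the device.

```c
/* Hypothetical device; all doodad_* names are invented. The primary
 * handler masks the interrupt in the hardware itself, so a level line
 * de-asserts without needing IRQF_ONESHOT -- which could not be mixed
 * with ONESHOT-less handlers on a shared line anyway. */
static irqreturn_t doodad_primary(int irq, void *arg)
{
	struct doodad *dd = arg;

	if (!doodad_irq_pending(dd))
		return IRQ_NONE;	/* shared line: not our device */

	doodad_mask_irq(dd);		/* disable the IRQ in the HW */
	return IRQ_WAKE_THREAD;
}

static irqreturn_t doodad_thread(int irq, void *arg)
{
	struct doodad *dd = arg;

	doodad_process_events(dd);	/* the real work, in thread context */
	doodad_unmask_irq(dd);		/* enable it again at the end */
	return IRQ_HANDLED;
}

/* IRQF_ONESHOT is deliberately not set, so this handler can coexist
 * with other handlers on the shared line that do not set it either. */
ret = request_threaded_irq(irq, doodad_primary, doodad_thread,
			   IRQF_SHARED, "doodad", dd);
```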
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Quoting Chris Wilson (2019-09-26 15:25:38)
> Moving our primary irq handler to a RT thread incurs an extra 1us delay
> in processing interrupts. This is most noticeable in waking up client
> threads, where it adds about 20% of extra latency. It also imposes a
> delay in feeding the GPU, an extra 1us before signaling secondary
> engines and extra latency in resubmitting work to keep the GPU busy.
> The latter case is insignificant as the latency is hidden by the active
> GPU, and preempt-to-busy ensures that no extra latency is incurred for
> preemption.

Fwiw, gem_wsim says the impact is around 1.25% for transcode (just
looking at the first hour or so of runs), which is enough to put the
idea on the back burner for now.
-Chris
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Quoting Brian Welty (2019-09-26 19:57:17)
>
> On 9/26/2019 7:25 AM, Chris Wilson wrote:
> > Moving our primary irq handler to a RT thread incurs an extra 1us delay
> > in processing interrupts. This is most noticeable in waking up client
> > threads, where it adds about 20% of extra latency. It also imposes a
> > delay in feeding the GPU, an extra 1us before signaling secondary
> > engines and extra latency in resubmitting work to keep the GPU busy.
> > The latter case is insignificant as the latency is hidden by the
> > active GPU, and preempt-to-busy ensures that no extra latency is
> > incurred for preemption.
> >
> > The benefit is that we reduced the impact on the rest of the system,
> > the cycletest shows a reduction from 5us mean latency to 2us, with the
> > maximum observed latency (in a 2 minute window) reduced by over 160us.
>
> Hi Chris,
>
> I'm curious to understand this a little better.
> If the only benefit of this patch is improving overall system
> performance, then can you say why i915 interrupt handling doesn't fit
> into what I thought was the common usage of threaded interrupts...

The way cyclictest determines system impact is via latency in delivery
of a timer to userspace; it observes irq-off time. We use that as a
convenient metric to determine how long we block the system while
servicing the gpu.

> Usually with request_threaded_irq(), I thought typically both handlers
> are specified and so you'd only fall back to the threaded handler when
> the interrupt context handler is overwhelmed?
> Like so:
>
>     while (HW events need some action) {
>         do something ...
>         if (too overwhelmed) /* ie. reduce load on system */
>             return IRQ_WAKE_THREAD;
>     }
>     return IRQ_HANDLED;
>
> Likely you considered something like the above.
> I'm just looking to understand why using only the threaded interrupt
> handler is best in this case.
> (Also maybe discuss this a little bit further in the commit message...?)

Convenience.
The primary motivation for the patch was to test Sebastian's patches in
our CI, which are targeting PREEMPT_RT issues; he noted those issues
would also be reproduced in a threaded irq handler. The measurements I
did were to assuage my own gut response that a threaded irq handler
would be devastating to our latency metrics (although a 1us extra
latency is terrible if our goal happens to be sub-1us gpu->client
latency!). (I was contemplating adding IRQF_NO_THREAD to avoid the
issue.)

The problem I feel with the overflow of primary to thread is deciding
what value is acceptable latency for the rest of the system; and why
bother to kick off a second thread if all it does is service a tasklet
-- would we be better served by a dedicated thread per-submission-thread?
(Which might sound familiar -- it's what we used before tasklets.) It's
definitely a tradeoff; our latency versus the rest of the system. Though
there were a few advantages to having an RT thread spawned for the
softirq (which I guess is because RT threads are granted a frequency
boost), and if we can reduce the number of irq-off spinlocks, that may
have much greater advantages across the driver that outweigh the cost of
the extra context switch in client wakeup.

> Sorry if this was perhaps discussed in response to Tvrtko's question.
> I didn't quite follow the fast/slow separation mentioned in that thread.

Our interrupt handlers are effectively split into 2 phases. First we
read the master control and various IIR, and ack the interrupt, after
which the GPU is free to raise an interrupt again. Then we process the
interrupt, handling the various bits asserted in the IIR (signaling
clients, feeding the GPU its next context, completing vblanks etc).
The first ack should be fast -- you can demonstrate the impact of not
doing the ack/handler split by measuring the throughput on multiple
engines (a context-switch event on the second engine will be noticeably
blocked by the interrupt handler on the first, if you are not careful).
(gen11+ breaks this split atm as we haven't worked out how we are meant
to ack the banked GT IIR yet, or at least no one has seen Icelake to
measure the impact.)
-Chris
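The two-phase ack/process split described above can be sketched roughly as follows. This is an illustrative outline only, not the real i915 handler: the register and helper names (gt_read, gt_write, the bit names) are invented, and the gen-specific ack sequence is considerably more involved.

```c
/* Hypothetical sketch of the ack/process split; names are invented. */
static irqreturn_t gpu_irq_handler(int irq, void *arg)
{
	struct gpu *gpu = arg;
	u32 master, iir;

	/* Phase 1: fast ack. Latch and clear the pending events so the
	 * GPU is free to raise the next interrupt immediately. */
	master = gt_read(gpu, MASTER_IRQ_CTL);
	if (!master)
		return IRQ_NONE;		/* shared line, not us */
	gt_write(gpu, MASTER_IRQ_CTL, 0);	/* gate delivery while acking */
	iir = gt_read(gpu, GT_IIR);
	gt_write(gpu, GT_IIR, iir);		/* ack the latched events */
	gt_write(gpu, MASTER_IRQ_CTL, master);	/* re-enable delivery */

	/* Phase 2: slower processing of what we latched; a context-switch
	 * event on another engine is no longer blocked behind us. */
	if (iir & GT_USER_INTERRUPT)
		signal_waiting_clients(gpu);
	if (iir & GT_CONTEXT_SWITCH)
		submit_next_context(gpu);
	if (iir & GT_VBLANK)
		complete_vblank(gpu);

	return IRQ_HANDLED;
}
```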
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 9/26/2019 7:25 AM, Chris Wilson wrote:
> Moving our primary irq handler to a RT thread incurs an extra 1us delay
> in processing interrupts. This is most noticeable in waking up client
> threads, where it adds about 20% of extra latency. It also imposes a
> delay in feeding the GPU, an extra 1us before signaling secondary
> engines and extra latency in resubmitting work to keep the GPU busy.
> The latter case is insignificant as the latency is hidden by the active
> GPU, and preempt-to-busy ensures that no extra latency is incurred for
> preemption.
>
> The benefit is that we reduced the impact on the rest of the system, the
> cycletest shows a reduction from 5us mean latency to 2us, with the
> maximum observed latency (in a 2 minute window) reduced by over 160us.

Hi Chris,

I'm curious to understand this a little better.
If the only benefit of this patch is improving overall system
performance, then can you say why i915 interrupt handling doesn't fit
into what I thought was the common usage of threaded interrupts...

Usually with request_threaded_irq(), I thought typically both handlers
are specified and so you'd only fall back to the threaded handler when
the interrupt context handler is overwhelmed?
Like so:

    while (HW events need some action) {
        do something ...
        if (too overwhelmed) /* ie. reduce load on system */
            return IRQ_WAKE_THREAD;
    }
    return IRQ_HANDLED;

Likely you considered something like the above.
I'm just looking to understand why using only the threaded interrupt
handler is best in this case.
(Also maybe discuss this a little bit further in the commit message...?)

Sorry if this was perhaps discussed in response to Tvrtko's question.
I didn't quite follow the fast/slow separation mentioned in that thread.
Thanks,
-Brian

> Signed-off-by: Chris Wilson
> Cc: Joonas Lahtinen
> Cc: Tvrtko Ursulin
> Cc: Clark Williams
> Cc: Sebastian Andrzej Siewior
> ---
> Note this should need the fixes in
> 20190926105644.16703-2-bige...@linutronix.de, by itself this should be a
> test vehicle to exercise that patch!
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index bc83f094065a..f3df7714a3f3 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
>
>  	intel_irq_reset(dev_priv);
>
> -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
>  	if (ret < 0) {
>  		dev_priv->drm.irq_enabled = false;
>  		return ret;
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 2019-09-26 16:40:34 [+0100], Chris Wilson wrote:
>
> It's all edge interrupts -- although for gen2/3 my memory is hazy. But
> the GPU (circa gen6) can generate more than enough interrupts to
> saturate a CPU.

:)

> -Chris

Sebastian
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Quoting Chris Wilson (2019-09-26 16:40:34)
> Quoting Sebastian Andrzej Siewior (2019-09-26 16:32:52)
> > On 2019-09-26 16:24:59 [+0100], Chris Wilson wrote:
> > > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > > index bc83f094065a..f3df7714a3f3 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > > @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
> > > > >
> > > > >  	intel_irq_reset(dev_priv);
> > > > >
> > > > > -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> > > > > -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> > > > > +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> > > > > +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
> > > >
> > > > I think you should add IRQF_ONESHOT. Otherwise a LEVEL interrupt will
> > > > keep interrupting the CPU and you never manage to switch to the thread.
> > >
> > > The interrupts only keep coming if we feed the GPU, and we only feed
> > > the GPU if we service the interrupt. That should provide a natural
> > > quiescence :)
> >
> > In IRQ-context your primary handler gets invoked which wakes the thread
> > (whatever intel_irq_handler() returns). Then you leave the IRQ context
> > and should switch to your IRQ-handler. This will never happen because
> > the IRQ line remains asserted and the CPU ends up in the primary
> > handler again.
> > An EDGE typed IRQ wouldn't notice a difference. But a LEVEL typed IRQ
> > will remain asserted until the hardware de-asserts the interrupt again.
> > Since you never reach your thread handler, I don't see how this can
> > happen.
>
> It's all edge interrupts -- although for gen2/3 my memory is hazy. But
> the GPU (circa gen6) can generate more than enough interrupts to
> saturate a CPU.

So everything older than gen5 has MSI disabled it appears and needs
ONESHOT.
-Chris
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Quoting Sebastian Andrzej Siewior (2019-09-26 16:32:52)
> On 2019-09-26 16:24:59 [+0100], Chris Wilson wrote:
> > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > index bc83f094065a..f3df7714a3f3 100644
> > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
> > > >
> > > >  	intel_irq_reset(dev_priv);
> > > >
> > > > -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> > > > -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> > > > +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> > > > +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
> > >
> > > I think you should add IRQF_ONESHOT. Otherwise a LEVEL interrupt will
> > > keep interrupting the CPU and you never manage to switch to the thread.
> >
> > The interrupts only keep coming if we feed the GPU, and we only feed
> > the GPU if we service the interrupt. That should provide a natural
> > quiescence :)
>
> In IRQ-context your primary handler gets invoked which wakes the thread
> (whatever intel_irq_handler() returns). Then you leave the IRQ context
> and should switch to your IRQ-handler. This will never happen because
> the IRQ line remains asserted and the CPU ends up in the primary
> handler again.
> An EDGE typed IRQ wouldn't notice a difference. But a LEVEL typed IRQ
> will remain asserted until the hardware de-asserts the interrupt again.
> Since you never reach your thread handler, I don't see how this can
> happen.

It's all edge interrupts -- although for gen2/3 my memory is hazy. But
the GPU (circa gen6) can generate more than enough interrupts to
saturate a CPU.
-Chris
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 2019-09-26 16:24:59 [+0100], Chris Wilson wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index bc83f094065a..f3df7714a3f3 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
> > >
> > >  	intel_irq_reset(dev_priv);
> > >
> > > -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> > > -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> > > +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> > > +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
> >
> > I think you should add IRQF_ONESHOT. Otherwise a LEVEL interrupt will
> > keep interrupting the CPU and you never manage to switch to the thread.
>
> The interrupts only keep coming if we feed the GPU, and we only feed the
> GPU if we service the interrupt. That should provide a natural
> quiescence :)

In IRQ-context your primary handler gets invoked which wakes the thread
(whatever intel_irq_handler() returns). Then you leave the IRQ context
and should switch to your IRQ-handler. This will never happen because
the IRQ line remains asserted and the CPU ends up in the primary handler
again.
An EDGE typed IRQ wouldn't notice a difference. But a LEVEL typed IRQ
will remain asserted until the hardware de-asserts the interrupt again.
Since you never reach your thread handler, I don't see how this can
happen.

> -Chris

Sebastian
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Quoting Sebastian Andrzej Siewior (2019-09-26 16:13:08)
> On 2019-09-26 15:25:38 [+0100], Chris Wilson wrote:
> > Moving our primary irq handler to a RT thread incurs an extra 1us delay
> > in processing interrupts. This is most noticeable in waking up client
> > threads, where it adds about 20% of extra latency. It also imposes a
> > delay in feeding the GPU, an extra 1us before signaling secondary
> > engines and extra latency in resubmitting work to keep the GPU busy.
> > The latter case is insignificant as the latency is hidden by the
> > active GPU, and preempt-to-busy ensures that no extra latency is
> > incurred for preemption.
> >
> > The benefit is that we reduced the impact on the rest of the system,
> > the cycletest shows a reduction from 5us mean latency to 2us, with the
> > maximum observed latency (in a 2 minute window) reduced by over 160us.
>
> cycletest is the name of the tool I don't know about, right? It is not
> cyclictest with a typo?

It's a tool like cyclictest that measures the timer latency in one
thread while flooding the system with GPU work, filling the page cache,
grabbing as many THP as possible etc. (It's actually called
gem_syslatency.)

> I don't know how many interrupts you have in a system but if this is the
> only one for the card then you could remove _irqsave() from locking if
> the lock is never observed in IRQ context (but then you have irqwork
> which requires this irqsave).

True, pushing the irq into a thread should allow us to re-evaluate a lot
of our spin_lock_irq.

> > Signed-off-by: Chris Wilson
> > Cc: Joonas Lahtinen
> > Cc: Tvrtko Ursulin
> > Cc: Clark Williams
> > Cc: Sebastian Andrzej Siewior
> > ---
> > Note this should need the fixes in
> > 20190926105644.16703-2-bige...@linutronix.de, by itself this should be
> > a test vehicle to exercise that patch!
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index bc83f094065a..f3df7714a3f3 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
> >
> >  	intel_irq_reset(dev_priv);
> >
> > -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> > -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> > +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> > +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
>
> I think you should add IRQF_ONESHOT. Otherwise a LEVEL interrupt will
> keep interrupting the CPU and you never manage to switch to the thread.

The interrupts only keep coming if we feed the GPU, and we only feed the
GPU if we service the interrupt. That should provide a natural
quiescence :)
-Chris
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 2019-09-26 15:57:07 [+0100], Tvrtko Ursulin wrote:
> 2. What about our tasklets - with threaded irqs we don't need them any
> more, right? So in this case they just add additional latency.

If you enqueue / schedule tasklets from your threaded handler then this
will wake up ksoftirqd and perform the work there (based on my memory of
the code).

> Regards,
>
> Tvrtko

Sebastian
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 2019-09-26 15:25:38 [+0100], Chris Wilson wrote:
> Moving our primary irq handler to a RT thread incurs an extra 1us delay
> in processing interrupts. This is most noticeable in waking up client
> threads, where it adds about 20% of extra latency. It also imposes a
> delay in feeding the GPU, an extra 1us before signaling secondary
> engines and extra latency in resubmitting work to keep the GPU busy.
> The latter case is insignificant as the latency is hidden by the active
> GPU, and preempt-to-busy ensures that no extra latency is incurred for
> preemption.
>
> The benefit is that we reduced the impact on the rest of the system, the
> cycletest shows a reduction from 5us mean latency to 2us, with the
> maximum observed latency (in a 2 minute window) reduced by over 160us.

cycletest is the name of the tool I don't know about, right? It is not
cyclictest with a typo?

I don't know how many interrupts you have in a system but if this is the
only one for the card then you could remove _irqsave() from locking if
the lock is never observed in IRQ context (but then you have irqwork
which requires this irqsave).

> Signed-off-by: Chris Wilson
> Cc: Joonas Lahtinen
> Cc: Tvrtko Ursulin
> Cc: Clark Williams
> Cc: Sebastian Andrzej Siewior
> ---
> Note this should need the fixes in
> 20190926105644.16703-2-bige...@linutronix.de, by itself this should be a
> test vehicle to exercise that patch!
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index bc83f094065a..f3df7714a3f3 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
>
>  	intel_irq_reset(dev_priv);
>
> -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> +				   IRQF_SHARED, DRIVER_NAME, dev_priv);

I think you should add IRQF_ONESHOT. Otherwise a LEVEL interrupt will
keep interrupting the CPU and you never manage to switch to the thread.

>  	if (ret < 0) {
>  		dev_priv->drm.irq_enabled = false;
>  		return ret;

Sebastian
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Quoting Tvrtko Ursulin (2019-09-26 15:57:07)
>
> On 26/09/2019 15:25, Chris Wilson wrote:
> > Moving our primary irq handler to a RT thread incurs an extra 1us delay
> > in processing interrupts. This is most noticeable in waking up client
> > threads, where it adds about 20% of extra latency. It also imposes a
> > delay in feeding the GPU, an extra 1us before signaling secondary
> > engines and extra latency in resubmitting work to keep the GPU busy.
> > The latter case is insignificant as the latency is hidden by the
> > active GPU, and preempt-to-busy ensures that no extra latency is
> > incurred for preemption.
> >
> > The benefit is that we reduced the impact on the rest of the system,
> > the cycletest shows a reduction from 5us mean latency to 2us, with the
> > maximum observed latency (in a 2 minute window) reduced by over 160us.
> >
> > Signed-off-by: Chris Wilson
> > Cc: Joonas Lahtinen
> > Cc: Tvrtko Ursulin
> > Cc: Clark Williams
> > Cc: Sebastian Andrzej Siewior
> > ---
> > Note this should need the fixes in
> > 20190926105644.16703-2-bige...@linutronix.de, by itself this should be
> > a test vehicle to exercise that patch!
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index bc83f094065a..f3df7714a3f3 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
> >
> >  	intel_irq_reset(dev_priv);
> >
> > -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> > -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> > +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> > +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
> >  	if (ret < 0) {
> >  		dev_priv->drm.irq_enabled = false;
> >  		return ret;
>
> Two questions:
>
> 1. Should we split out the master_ctl handling into the fast handler?
> Although can we pass anything from the fast to threaded handler? If not
> we'd have to re-read the master_ctl from the threaded handler.

I did look at using the primary/threaded irq handler to do some
fast/slow handling (but RT is probably going to force the primary into a
thread, so the long term prospect there looked bleak ;)

In our pre-gen11 model we could certainly do the ack in the fast primary
handler, pushing the processing into the secondary thread. Post gen11 is
more annoying and we probably only do the master_ctl immediately. (It's
very tempting to put the GT handler into fast, but that defeats the
PREEMPT_RT objections ;)

We can accumulate into i915 anything we want to pass from primary to
secondary. Or we can use per-cpu handlers and data structures.

> 2. What about our tasklets - with threaded irqs we don't need them any
> more, right? So in this case they just add additional latency.

Our tasklets run immediately within the RT thread. That appears to be a
positive outcome of this. We keep our direct submission so our initial
latency is not impacted at all, and the impact of the thread latency is
minimised by keeping the GPU busy. So the only (easily?) measurable
impact is client wakeup. (I have a modicum of concern that there is a
priority inversion here for an RT client waiting on the GPU.)
-Chris
Re: [Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
On 26/09/2019 15:25, Chris Wilson wrote:
> Moving our primary irq handler to a RT thread incurs an extra 1us delay
> in processing interrupts. This is most noticeable in waking up client
> threads, where it adds about 20% of extra latency. It also imposes a
> delay in feeding the GPU, an extra 1us before signaling secondary
> engines and extra latency in resubmitting work to keep the GPU busy.
> The latter case is insignificant as the latency is hidden by the active
> GPU, and preempt-to-busy ensures that no extra latency is incurred for
> preemption.
>
> The benefit is that we reduced the impact on the rest of the system, the
> cycletest shows a reduction from 5us mean latency to 2us, with the
> maximum observed latency (in a 2 minute window) reduced by over 160us.
>
> Signed-off-by: Chris Wilson
> Cc: Joonas Lahtinen
> Cc: Tvrtko Ursulin
> Cc: Clark Williams
> Cc: Sebastian Andrzej Siewior
> ---
> Note this should need the fixes in
> 20190926105644.16703-2-bige...@linutronix.de, by itself this should be a
> test vehicle to exercise that patch!
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index bc83f094065a..f3df7714a3f3 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
>
>  	intel_irq_reset(dev_priv);
>
> -	ret = request_irq(irq, intel_irq_handler(dev_priv),
> -			  IRQF_SHARED, DRIVER_NAME, dev_priv);
> +	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
> +				   IRQF_SHARED, DRIVER_NAME, dev_priv);
>  	if (ret < 0) {
>  		dev_priv->drm.irq_enabled = false;
>  		return ret;

Two questions:

1. Should we split out the master_ctl handling into the fast handler?
Although can we pass anything from the fast to threaded handler? If not
we'd have to re-read the master_ctl from the threaded handler.

2. What about our tasklets - with threaded irqs we don't need them any
more, right? So in this case they just add additional latency.

Regards,

Tvrtko
[Intel-gfx] [PATCH] drm/i915: Delegate our irq handler to a thread
Moving our primary irq handler to a RT thread incurs an extra 1us delay
in processing interrupts. This is most noticeable in waking up client
threads, where it adds about 20% of extra latency. It also imposes a
delay in feeding the GPU, an extra 1us before signaling secondary
engines and extra latency in resubmitting work to keep the GPU busy. The
latter case is insignificant as the latency is hidden by the active GPU,
and preempt-to-busy ensures that no extra latency is incurred for
preemption.

The benefit is that we reduced the impact on the rest of the system, the
cycletest shows a reduction from 5us mean latency to 2us, with the
maximum observed latency (in a 2 minute window) reduced by over 160us.

Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: Tvrtko Ursulin
Cc: Clark Williams
Cc: Sebastian Andrzej Siewior
---
Note this should need the fixes in
20190926105644.16703-2-bige...@linutronix.de, by itself this should be a
test vehicle to exercise that patch!
---
 drivers/gpu/drm/i915/i915_irq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index bc83f094065a..f3df7714a3f3 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -4491,8 +4491,8 @@ int intel_irq_install(struct drm_i915_private *dev_priv)
 
 	intel_irq_reset(dev_priv);
 
-	ret = request_irq(irq, intel_irq_handler(dev_priv),
-			  IRQF_SHARED, DRIVER_NAME, dev_priv);
+	ret = request_threaded_irq(irq, NULL, intel_irq_handler(dev_priv),
+				   IRQF_SHARED, DRIVER_NAME, dev_priv);
 	if (ret < 0) {
 		dev_priv->drm.irq_enabled = false;
 		return ret;
-- 
2.23.0