On 11/12/2015 03:19 PM, Chris Wilson wrote:
So today, I figured I'd try just killing that spin. If it fails, we'll
punt to normal completions, so easy change. And wow, MASSIVE difference.
I can now scroll in chrome and not rage! It's like the laptop is 10x
faster now.
Ran git blame, and found:
commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
Author: Chris Wilson <ch...@chris-wilson.co.uk>
Date: Tue Apr 7 16:20:41 2015 +0100
drm/i915: Optimistically spin for the request completion
and read the commit message. Doesn't sound that impressive. Especially
not for something that screws up interactive performance by a LOT.
What's the deal? Revert?
The tests that it improved the most were the latency sensitive tests and
since my Broadwell xps13 behaves itself, I'd like to understand how it
culminates in an interactivity loss.
1. Maybe it is the uninterruptible nature of the polling, making X's
SIGIO jerky:
This one still feels bad.
2. Or maybe it is increased mutex contention:
And so does this one... I had to manually apply hunks 2-3, and after
doing seat-of-the-pants testing for both variants, I confirmed with perf
that we're still seeing a ton of time in __i915_wait_request() for both
of them.
Or maybe it is an indirect effect, such as power balancing between the
CPU and GPU, or just thermal throttling, or it may be the task being
penalised for consuming its timeslice (for which any completion polling
seems susceptible).
Look, polls in the 1-10ms range are just insane. Either you botched the
commit message and really meant "~1ms at most" and in which case I'd
suspect you of smoking something good, or you hacked it up wrong and
used jiffies when you really wanted to be using some other time check
that really did give you 1us.
I'll take an IRQ over 10 msecs of busy looping on my laptop, thanks.
"Limit the spinning to a single jiffie (~1us) at most"
is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms!
Even if I had HZ=1000, that'd still be 1ms of spinning. That's
seriously screwed up, guys.
That's over and above the termination condition for blk_poll().
?! And this is related, how? Comparing apples and oranges. One is a test
opt-in feature for experimentation, the other is unconditionally enabled
for everyone. I believe the commit even says so. See the difference?
Would I use busy loop spinning waiting for rotating storage completions,
which are in the 1-10ms range? No, with the reason being that the
potential wins for spins are in the usec range.
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/