I think I've seen instances of this problem even with the old IPC system:
the sending thread is likely to get descheduled because the receiving
thread is woken up before the sender has finished running. We once kicked
around an idea about buffering message sends and only flushing them once
the current task is finished -- maybe it's time to revisit something
like that?
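
Something along these lines, perhaps (purely a sketch; the names and the
flush hook are made up, not existing APIs):

  #include <functional>
  #include <map>
  #include <utility>
  #include <vector>

  // Stand-in for whatever owns the destination thread's queue
  // (a MessageLoop/TaskRunner in practice).
  struct Receiver {
    void PostAndSignal(std::vector<std::function<void()>> batch) {
      // Platform-specific: enqueue the batch and wake the thread once.
    }
  };

  // Buffer outgoing cross-thread messages for the duration of the current
  // task and signal each receiver only once, after the task returns, so
  // the sender isn't preempted mid-task by a freshly boosted receiver.
  class BufferedSender {
   public:
    void Send(Receiver* receiver, std::function<void()> msg) {
      pending_[receiver].push_back(std::move(msg));  // no wake-up here
    }

    // Called by the task runner once the current task has finished.
    void Flush() {
      for (auto& [receiver, batch] : pending_)
        receiver->PostAndSignal(std::move(batch));  // one signal per receiver
      pending_.clear();
    }

   private:
    std::map<Receiver*, std::vector<std::function<void()>>> pending_;
  };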

- Sami

On Wed, Aug 22, 2018 at 1:15 AM Bruce Dawson (brucedaw...@chromium.org)
wrote:

> I've definitely been bitten by this. On one game engine I worked on, the
> code signaled all of the worker threads whenever a task was ready. Due to
> the priority boosting, all of them would wake up and try to acquire the
> scheduler lock. The scheduler lock was held by the thread that had signaled
> all of the worker threads, which was reliably no longer running. And oh, by
> the way, it was a spin lock, so the main thread couldn't release it because
> it wasn't running. The call to SetEvent() would frequently take 20 ms to
> return.
>
> There were a lot of problems with this:
>
>    - Don't signal all of your worker threads when you have just one task
>    - Don't use a spin lock
>
> In this case the priority boosting made the issues critical, but it wasn't
> the underlying problem.
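>
> Roughly the shape that avoids both problems (a generic sketch in modern
> C++, not that engine's actual code): use a plain mutex for the queue and
> wake a single worker per task.
>
>   #include <condition_variable>
>   #include <deque>
>   #include <functional>
>   #include <mutex>
>
>   std::mutex queue_mutex;  // regular mutex, not a spin lock
>   std::condition_variable work_available;
>   std::deque<std::function<void()>> task_queue;
>
>   void SubmitTask(std::function<void()> task) {
>     {
>       std::lock_guard<std::mutex> lock(queue_mutex);
>       task_queue.push_back(std::move(task));
>     }
>     // notify_one() wakes exactly one worker; notify_all() here would
>     // recreate the thundering herd that piled onto the scheduler lock.
>     work_available.notify_one();
>   }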
>
> I commented on the bug. I do think this is worth exploring, but there are
> probably cases where we rely on this priority boost to avoid starvation or
> improve response times. It's possible that we'd see better results by
> somehow reducing the number of cross-thread/cross-process messages we send.
>
> Also, note that on systems with enough cores the priority boost can become
> irrelevant - two communicating threads will migrate to different cores and
> both will continue running. So, our workstations will behave fundamentally
> differently from customer machines. Yay.
>
> On Mon, Aug 20, 2018 at 4:37 PM Gabriel Charette <g...@chromium.org> wrote:
>
>> Hello scheduler devs (and *v8/chromium-mojo* friends -- sorry for
>> cross-posting; see related note below).
>>
>> Some kernels give a boost to a thread when the resource it was waiting on
>> is signaled (lock, event, pipe, file I/O, etc.). Some platforms document
>> this
>> <https://docs.microsoft.com/en-us/windows/desktop/procthread/priority-boosts>;
>> on others we've anecdotally observed things that make us believe they do.
>>
>> I think this might be hurting Chrome's task system.
>>
> >> The Chrome semantics when signaling a thread are often "hey, you have
> >> work, you should run soon", not "hey, please do this work ASAP"; I think...
> >> This is certainly the case for TaskScheduler use cases, but I'm not so sure
> >> about input use cases (e.g. 16 thread hops to respond to input IIRC; the
> >> boost probably helps that chain a lot..?).
>> But in a case where there are many messages (e.g. *mojo*), this means
>> many context switches (send one message; switch; process one message;
>> switch back; etc.).
>>
>> https://crbug.com/872248#c4 suggests that MessageLoop::ScheduleWork() is
>> really expensive (though there may be sampling bias there -- investigation
>> in progress).
>>
> >> https://crbug.com/872248 also suggests that the Blink main thread is
> >> descheduled while it's trying to signal workers to help it on a parallel
> >> task (I've observed this first hand when working in *v8* this winter, but
> >> didn't know what to think of it at the time: trace1
> >> <https://drive.google.com/file/d/1YFC8lh67rCEQOMA2_A8i7BlFw_NHkCma/view?usp=sharing>,
> >> trace2
> >> <https://drive.google.com/file/d/1prrkIlNApLNeu-ppL_5PQT8a2opgKubb/view?usp=sharing>
> >> ).
>>
>> On Windows we can tweak this with
>> ::SetProcessPriorityBoost/SetThreadPriorityBoost(). Not sure about POSIX. I
>> might try to experiment with this (feels scary..!).
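> >>
> >> For reference, the raw calls are just the following (a minimal sketch of
> >> the experiment, not actual Chrome code; TRUE means "disable the boost"):
> >>
> >>   #include <windows.h>
> >>
> >>   void DisablePriorityBoostForExperiment() {
> >>     // Opt the whole process out of the scheduler's priority boost...
> >>     ::SetProcessPriorityBoost(::GetCurrentProcess(), TRUE);
> >>     // ...and/or an individual thread (here, the calling thread).
> >>     ::SetThreadPriorityBoost(::GetCurrentThread(), TRUE);
> >>   }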
>>
> >> In the meantime I figured it would at least be good to inform all of you
> >> so you no longer scratch your heads at these occasional unexplained
> >> delays in traces.
>>
>> Cheers!
>> Gab
>>
