On Thu, Jan 11, 2018 at 12:34 PM, Dmitry Safonov <d...@arista.com> wrote:
> On Thu, 2018-01-11 at 12:22 -0800, Linus Torvalds wrote:
>> On Thu, Jan 11, 2018 at 12:16 PM, Eric Dumazet <eduma...@google.com>
>> wrote:
>> >
>> > Note that when I implemented TCP Small queues, I did experiments
>> > between using a work queue or a tasklet, and workqueues added
>> > unacceptable P99 latencies, when many user threads are competing
>> > with kernel threads.
>>
>> Yes.
>>
>> So I think one solution might be to have a hybrid system, where we do
>> the softirq's synchronously normally (which is what you really want
>> for good latency).
>>
>> But then fall down on a threaded model - but that fallback case
>> should be per-softirq, not global. So if one softirq uses a lot of
>> CPU time, that shouldn't affect the latency of other softirqs.
>>
>> So maybe we could get rid of the per-cpu ksoftirqd entirely, and
>> replace it with with per-cpu and per-softirq workqueues?
>>
>> Would something like that sound sane?
>>
>> Just a SMOP/SMOT (small matter of programming/testing).
>
> I could try to write a PoC for that..
> What should be the trigger to fall into workqueue?
> How to tell if there're too many softirqs of the kind?
> Current logic with if (pending) in the end of __do_softirq()
> looks working selectively..
> It looks to be still possible to starve a cpu.

I guess we would need to track the amount of time spent processing
softirqs (while interrupting a non-idle task).
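
A rough userspace sketch of what such accounting could look like. This is
not kernel code: the struct, the function names, the accounting window and
the 2 ms budget are all made up for illustration, and only the count of 10
is meant to mirror the kernel's NR_SOFTIRQS. The idea is to charge time per
softirq, skip the charge when the interrupted task was idle, and defer only
the softirq that exceeded its own budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants, chosen only for this sketch. */
#define NR_SOFTIRQS_SKETCH 10          /* meant to mirror the kernel's NR_SOFTIRQS */
#define SOFTIRQ_BUDGET_NS  2000000ULL  /* assumed 2 ms budget per accounting window */

/* Per-CPU accounting state: one counter per softirq number. */
struct softirq_acct {
	uint64_t spent_ns[NR_SOFTIRQS_SKETCH];
};

/* Start a new accounting window (for example, once per tick). */
static void softirq_acct_reset(struct softirq_acct *a)
{
	for (unsigned int i = 0; i < NR_SOFTIRQS_SKETCH; i++)
		a->spent_ns[i] = 0;
}

/*
 * Charge one handler run to softirq 'nr'. Time is only counted when the
 * handler interrupted a non-idle task, since running on an otherwise
 * idle CPU steals nothing from anybody.
 */
static void softirq_charge(struct softirq_acct *a, unsigned int nr,
			   uint64_t start_ns, uint64_t end_ns,
			   bool interrupted_task_is_idle)
{
	assert(end_ns >= start_ns);
	if (nr >= NR_SOFTIRQS_SKETCH || interrupted_task_is_idle)
		return;
	a->spent_ns[nr] += end_ns - start_ns;
}

/*
 * Per-softirq trigger: true once this softirq alone has used up its
 * budget, so only it would be pushed to the threaded fallback.
 */
static bool softirq_should_defer(const struct softirq_acct *a, unsigned int nr)
{
	return nr < NR_SOFTIRQS_SKETCH && a->spent_ns[nr] >= SOFTIRQ_BUDGET_NS;
}
```

With something like this, a softirq that burns through its budget would go
to the threaded path while the others keep running synchronously, which is
the per-softirq (not global) fallback discussed above.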