On 04/05/12 21:45, Alexander Motin wrote:
On 05.04.2012 21:12, Arnaud Lacombe wrote:
Hi,

[Sorry for the delay, I got a bit sidetrack'ed...]

2012/2/17 Alexander Motin <m...@freebsd.org>:
On 17.02.2012 18:53, Arnaud Lacombe wrote:

On Fri, Feb 17, 2012 at 11:29 AM, Alexander Motin <m...@freebsd.org> wrote:

On 02/15/12 21:54, Jeff Roberson wrote:

On Wed, 15 Feb 2012, Alexander Motin wrote:

I've decided to stop those cache black-magic practices and focus on
things that really exist in this world -- SMT and CPU load. I've
dropped most of the cache-related things from the patch and made the
rest more strict and predictable:
http://people.freebsd.org/~mav/sched.htt34.patch


This looks great. I think there is value in considering the other
approach further, but I would like to do this part first. It would be
nice to give priority a greater influence in the load balancing as
well.


I haven't got a good idea yet about balancing priorities, but I've
rewritten the balancer itself. Since sched_lowest() / sched_highest()
are more intelligent now, they allowed me to remove the topology
traversal from the balancer itself. That should fix the double-swapping
problem, keep some affinity while moving threads, and make balancing
more fair. I ran a number of tests with 4, 8, 9 and 16 CPU-bound
threads on 8 CPUs. With 4, 8 and 16 threads everything is stationary,
as it should be. With 9 threads I see regular and random load movement
between all 8 CPUs. Measurements over a 5-minute run show a deviation
of only about 5 seconds. That is the same deviation I see caused merely
by scheduling 16 threads on 8 cores without any balancing needed at
all. So I believe this code works as it should.
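
For illustration, a minimal userland sketch of the idea -- not the
actual sched_ule.c code; find_highest()/find_lowest() are stand-ins
for sched_highest()/sched_lowest(), and per-CPU loads are just made-up
integers:

/*
 * Sketch of the balancing approach described above: instead of
 * walking the topology tree, repeatedly ask for the most and least
 * loaded CPUs and migrate one thread between them.
 */
#include <stdio.h>

#define	NCPU	8

static int load[NCPU] = { 3, 1, 0, 2, 5, 1, 0, 4 };

/* Stand-in for sched_lowest(): index of the least loaded CPU. */
static int
find_lowest(void)
{
	int i, best = 0;

	for (i = 1; i < NCPU; i++)
		if (load[i] < load[best])
			best = i;
	return (best);
}

/* Stand-in for sched_highest(): index of the most loaded CPU. */
static int
find_highest(void)
{
	int i, best = 0;

	for (i = 1; i < NCPU; i++)
		if (load[i] > load[best])
			best = i;
	return (best);
}

int
main(void)
{
	int hi, lo;

	/*
	 * Move one thread per step from the busiest to the idlest CPU
	 * until the loads differ by at most one.  Moving a single
	 * thread per pair avoids the "double swap", where separate
	 * traversals each migrate the same thread.
	 */
	for (;;) {
		hi = find_highest();
		lo = find_lowest();
		if (load[hi] - load[lo] <= 1)
			break;
		load[hi]--;
		load[lo]++;
		printf("moved one thread: CPU%d -> CPU%d\n", hi, lo);
	}
	for (hi = 0; hi < NCPU; hi++)
		printf("CPU%d load %d\n", hi, load[hi]);
	return (0);
}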

Here is the patch: http://people.freebsd.org/~mav/sched.htt40.patch

I plan this to be the final patch of this series (more to come :)),
and if there are no problems or objections, I am going to commit it
(except some debugging KTRs) in about ten days. So now is a good time
for reviews and testing. :)

Is there a place where all the patches are available?


All my scheduler patches are cumulative, so all you need is the last
one mentioned here, sched.htt40.patch.

You may want to have a look at the results I collected in the
`runs/freebsd-experiments' branch of:

https://github.com/lacombar/hackbench/

and compare them with the vanilla FreeBSD 9.0 and -CURRENT results
available in `runs/freebsd'. On the dual-package platform, your patch
is not a definite win.

But in some cases, especially on multi-socket systems, to let it show
its best you may want to apply an additional patch from avg@ to better
detect the CPU topology:
https://gitorious.org/~avg/freebsd/avgbsd/commit/6bca4a2e4854ea3fc275946a023db65c483cb9dd


The test I conducted specifically for this patch did not show much
improvement...

If I understand right, this test runs thousands of threads sending and
receiving data over pipes. It is quite likely that all CPUs will
always be busy, so load balancing is not really important in this
test. What does look good is that the more complicated new code is not
slower than the old one.
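
For reference, a stripped-down sketch of what I understand the
benchmark's inner loop to look like: a pair of processes exchanging
small messages over a pipe, so the run is dominated by pipe I/O and
context switches rather than by balancing decisions (the message size
and count here are made up):

#include <sys/wait.h>
#include <err.h>
#include <string.h>
#include <unistd.h>

#define	MSGSIZE	100
#define	NMSGS	10000

int
main(void)
{
	char buf[MSGSIZE];
	int fds[2];
	pid_t pid;
	int i;

	if (pipe(fds) == -1)
		err(1, "pipe");
	pid = fork();
	if (pid == -1)
		err(1, "fork");
	if (pid == 0) {
		/* Receiver: drain NMSGS messages from the pipe. */
		close(fds[1]);
		for (i = 0; i < NMSGS; i++)
			if (read(fds[0], buf, sizeof(buf)) <= 0)
				err(1, "read");
		_exit(0);
	}
	/* Sender: push NMSGS small messages through the pipe. */
	close(fds[0]);
	memset(buf, 'x', sizeof(buf));
	for (i = 0; i < NMSGS; i++)
		if (write(fds[1], buf, sizeof(buf)) != sizeof(buf))
			err(1, "write");
	close(fds[1]);
	waitpid(pid, NULL, 0);
	return (0);
}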

While this test seems very scheduler-intensive, it may depend on many
other factors, such as syscall performance, context-switch cost, etc.
I'll try to play with it more.

My profiling on an 8-core Core i7 system shows that the sched_ule.c code at the top of the profile still consumes only 13% of kernel CPU time, while doing a million context switches per second. cpu_search(), affected by this patch, takes even less -- only 8%. The rest of the time is spread between many other small functions. I did some optimizations in r234066 to reduce cpu_search() time to 6%, but looking at how unstable the results of this test are, hardly any difference there can really be measured by it.

I have a strong feeling that while this test may be interesting for profiling, its own results depend in the first place not on how fast the scheduler is, but on the pipe capacity and other such things. Can somebody give me a hint: what, except pipe capacity and the context switch to the unblocked receiver, prevents the sender from sending all its data in a batch, and the receiver from then receiving it all in a batch? If different OSes have different policies there, I think the results could be incomparable.
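
For example, one way to see the pipe-capacity factor is to make the
write side non-blocking and count how many bytes fit before the kernel
pushes back with EAGAIN; up to that point the sender can run in a
batch, and after it every further write forces a switch to the
receiver. A sketch, not taken from the benchmark itself:

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	char buf[1024];
	int fds[2];
	long total = 0;
	ssize_t n;

	if (pipe(fds) == -1)
		err(1, "pipe");
	/* Make the write side non-blocking so a full pipe returns EAGAIN. */
	if (fcntl(fds[1], F_SETFL, O_NONBLOCK) == -1)
		err(1, "fcntl");
	memset(buf, 0, sizeof(buf));
	for (;;) {
		n = write(fds[1], buf, sizeof(buf));
		if (n == -1) {
			if (errno == EAGAIN)
				break;	/* Pipe is full. */
			err(1, "write");
		}
		total += n;
	}
	printf("pipe capacity: %ld bytes\n", total);
	return (0);
}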

--
Alexander Motin
