On 2/6/12 11:10 AM, Alexander Best wrote:
On Mon Feb  6 12, Alexander Motin wrote:
On 02/06/12 19:37, Tijl Coosemans wrote:
On Monday 06 February 2012 17:29:14 Alexander Motin wrote:
On 02/06/12 18:01, Alexander Best wrote:
On Mon Feb  6 12, Alexander Motin wrote:
I've analyzed the scheduler behavior and think I've found the problem with HTT.
SCHED_ULE knows about HTT and, when doing load balancing once a second,
it does the right things. Unluckily, if some other thread gets in the way,
a process can easily be pushed out to another CPU, where it will stay for
another second because of CPU affinity, possibly sharing a physical core
with something else without need.

I've made a patch, reworking SCHED_ULE affinity code, to fix that:
http://people.freebsd.org/~mav/sched.htt.patch

This patch does three things:
   - Disables the strict affinity optimization when HTT is detected, to let
more sophisticated code take the load of the other logical core(s) into
account.
   - Adds affinity support to the sched_lowest() function to prefer the
specified (last-used) CPU (and the CPU groups it belongs to) in case of
equal load (see the sketch after this list). The previous code always
selected the first valid CPU among equals, which caused threads to migrate
to lower-numbered CPUs without need.
   - If the current CPU group has no CPU where the thread, at its priority,
can run now, sequentially check the parent CPU groups before doing a global
search. That should improve affinity for the next cache levels.

Who wants to do independent testing to verify my results or do some more
interesting benchmarks? :)
i don't have any benchmarks to offer, but i'm seeing a massive increase in
responsiveness with your patch. with an unpatched kernel, opening xterm
while unrar'ing some huge archive could take up to 3 minutes!!! with your
patch the time it takes for xterm to start is never > 10 seconds!!!
Thank you for the report. I can suggest an explanation for this. The
original code does only one pass looking for a CPU where the thread can
run immediately. That pass is limited to the first level of the CPU
topology (for HTT systems, one physical core). If it sees no good
candidate, it just looks for the CPU with minimal load, ignoring thread
priority. I suppose that may lead to a priority violation: scheduling the
thread to a CPU where a higher-priority thread is running, where it may
wait for a very long time, while some other CPU is running a
minimal-priority thread. My patch does more searches, which allows it to
handle priorities better (a simplified sketch follows).
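As an illustration of the widened search (a toy model, not the real
kernel code; struct cpu_group here is a simplified stand-in, and
cpu_curpri, group_runnable_cpu and pick_cpu are hypothetical names):

/*
 * Widen the search one topology level at a time before falling back
 * to a global lowest-load pick.  As in FreeBSD, a lower numeric
 * priority value means a more important thread.
 */
#include <stddef.h>

struct cpu_group {
	struct cpu_group *parent;	/* next topology/cache level up */
	const int *cpus;		/* CPU ids in this group */
	int ncpus;
};

static int cpu_curpri[64];	/* priority now running on each CPU (assumed) */

/*
 * Return a CPU in 'cg' where a thread of priority 'pri' could run
 * immediately, i.e. where the currently running priority is worse
 * (numerically higher); -1 if the group has none.
 */
static int
group_runnable_cpu(const struct cpu_group *cg, int pri)
{
	for (int i = 0; i < cg->ncpus; i++)
		if (cpu_curpri[cg->cpus[i]] > pri)
			return (cg->cpus[i]);
	return (-1);
}

static int
pick_cpu(const struct cpu_group *leaf, int pri)
{
	/*
	 * Start with the smallest group containing the last-used CPU
	 * and walk up through the parent groups, so the thread lands
	 * as close to its old caches as possible.
	 */
	for (const struct cpu_group *cg = leaf; cg != NULL; cg = cg->parent) {
		int cpu = group_runnable_cpu(cg, pri);
		if (cpu >= 0)
			return (cpu);
	}
	return (-1);	/* caller falls back to a global lowest-load search */
}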
But why would unrar have a higher priority?
I am not good with ULE priority calculations yet, but I think there could
be many factors. Both the GUI and unrar are probably running at the same
nice level of 0, so initially they are equal. After they start, their
priorities depend on CPU time spent, the calculated "interactivity", and
who knows what else. So at some moments unrar may possibly get a higher
priority than the GUI. Several kernel threads, which have higher priority
by definition, may also be involved.
one factor might also be that by doing i/o, a process can gather a higher
priority compared to a task doing cpu-intensive work. statistics have
shown (at least historically) that i/o-intensive tasks tend to issue i/o
bursts and then either quit or stay idle. so schedulers usually prefer
those processes, since there's a good chance their request can be served
within one time slice. cpu-intensive tasks, on the other hand, usually
require more than one time slice. this might cause a situation where the
'unrar' process is handled at a very high priority in order to finish its
i/o burst. however, since unrar'ing a big file isn't really an i/o burst,
the cpu-intensive process (or any other process in general) will suffer
from starvation.

this is just a wild guess though. it might be complete nonsense, since i
haven't got a clue what kind of scheduling (rr, fcfs, ...) ULE is doing.
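For reference, ULE really does derive an interactivity score from how
much a thread sleeps versus runs, which is roughly the effect described
above. A toy sketch of such a sleep-vs-run heuristic (the scaling and
names are illustrative assumptions, not the actual
sched_interact_score() from sched_ule.c):

/* Score in [0, 100]: lower means "more interactive" and gets favored. */
static int
interact_score(unsigned long run_ticks, unsigned long sleep_ticks)
{
	if (sleep_ticks > run_ticks)
		/* Mostly sleeping (I/O-bound): score drops toward 0. */
		return ((int)(50UL * run_ticks / sleep_ticks));
	if (run_ticks > sleep_ticks)
		/* Mostly running (CPU-bound): score rises toward 100. */
		return ((int)(100 - 50UL * sleep_ticks / run_ticks));
	return (50);
}

By this kind of measure, a process alternating disk reads with bursts of
CPU work can still look "interactive" enough to be favored, which matches
the starvation guess above.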

btw: does anybody know if there are plans to commit the BFS scheduler to
HEAD at some point? personally i think what would be even better than a
new scheduling algorithm would be a scheduling infrastructure similar to
GEOM. that way it would be much easier to implement new algorithms (maybe
in XML).

a scheduler 'framework' is rather hard to do because the scheduler
interface is "complicated".. I and others did try to abstract the
scheduler a bit back around 2000. As you can see, we did manage to have 2
separate schedulers. adding a third should be easier, but the current
interface is still slightly tied to the things that those two schedulers
need.

cheers.
alex

--
Alexander Motin
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

Reply via email to