On Tue, 23 Oct 2007, Kris Kennaway wrote:

Josh Carroll wrote:
Anyway, in summary, ULE is about 5-6% slower than 4BSD for two
workloads that I am sensitive to: building world with -j X, and ffmpeg
-threads X. Other benchmarks seem to indicate relatively equal
performance between the two. MySQL, on the other hand, is
significantly faster in ULE.

5-6% is a lot.  ULE has some tuning for makeworld in -current, which
for me reduced it to less than 1% slower than 4BSD (down from 5-10%
slower), for the case of makeworld -j4 over nfs on a 2-CPU system with
the sources pre-cached on the server and objects on a local file system,
and extensive local tuning of makeworld, nfs and network drivers.  I
think the tuning in ULE was mainly for a 2-CPU system, because makeworld
seemed to be very bad under ULE only with 2 CPUs.  Apparently, it is also
very bad with more CPUs.  There are sysctls to modify the ULE tuning.
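
The percentages above are relative differences in wall-clock time between the two schedulers. A minimal sketch of that arithmetic follows, assuming you have one elapsed time per scheduler from time(1). The helper name and both times are hypothetical, not measurements from this thread:

```python
def percent_slower(candidate_s, baseline_s):
    """Return how much slower the candidate run is than the baseline, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (candidate_s - baseline_s) / baseline_s * 100.0


# Hypothetical wall-clock times in seconds (illustrative only).
time_4bsd = 1000.0
time_ule = 1055.0

slowdown = percent_slower(time_ule, time_4bsd)
print(f"ULE is {slowdown:.1f}% slower than 4BSD")
```

A negative result means the candidate was faster. Run-to-run variance in makeworld can be comparable to a 1% difference, so averaging several runs per scheduler is advisable before comparing.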

I'm trying to understand why ffmpeg and buildworld are slower in ULE
than 4BSD, since it seems to me that ULE was supposed to be the better
scaling scheduler.

Makeworld is slower because any scheduling is bad for it.  More context
switches take longer and cost more by reducing affinity.

Does anyone have any additional performance tests I can run that might
help indicate where the deficiency is in the ULE scheduler? MySQL
performance is excellent, so I'm wondering if it was tuned to that
particular workload?

I think it was.

One major difference is that your workload is 100% user. Also you were reporting ULE had more idle time, which looks like a bug since I would expect it to be basically 0% idle on such a workload.

No, at least buildworld, while being mainly user-CPU-bound by the gcc
hog, does some disk accesses and a significant number of syscalls.  I
have to work very hard to reduce its idle time to about 5% for UP on
local disks and to 11% for 2-way SMP over nfs.

More idle time for ULE at least used to be a feature.  ULE sometimes wants
to avoid switching to another thread immediately, in the hope of finding
a thread with better affinity than the currently runnable ones.  It
waited far too long (in its idle threads) for makeworld with 2 CPUs.
Waiting has a better chance of being best if there are many CPUs.

Bruce