Re: Only 70% of theoretical peak performance on FreeBSD 8/amd64, Corei7 920

Andriy Gapon Mon, 12 Apr 2010 07:42:54 -0700

on 12/04/2010 07:12 Maho NAKATA said the following:
> Hi FreeBSD developers,
> [the original article in Japanese can be found at
> http://blog.goo.ne.jp/nakatamaho/e/b5f6fbc3cc6e1ac4947463eb1ca4eb0a ] 
> 
> *Abstract*
> I compared the peak performance of FreeBSD 8.0/amd64 and Ubuntu 9.10/amd64
> using dgemm (a linear algebra routine, matrix-matrix multiplication).
> I obtained only 70% of theoretical peak performance on FreeBSD 8/amd64 and
> almost 95% on Ubuntu 9.10/amd64. I'm really disappointed.


Sorry about that, but the more important question (for us) is: are you willing
to help us improve, in addition to reporting your results?

> *Introduction*
> I'm a friend of Gotoh Kazushige, the principal developer of GotoBLAS. He
> told me that FreeBSD is not a suitable OS for scientific computing or high
> performance computing. He says (in Japanese, with my translation):
> 
>> I guess FreeBSD does page coloring, but I don't think FreeBSD takes into
>> account the very large caches that recent CPUs have.

AFAIK, recent FreeBSD doesn't use page coloring anymore.

>> Support for very large caches on Linux is still not very sophisticated,
>> but on *BSDs it's worse; they use too fine-grained a memory allocation
>> method, so we cannot expect large contiguous physical memory allocations.

Can your friend explain these points in more technical terms?
E.g. what kind of support, in his opinion, is needed for very large caches?
Why, in his opinion, does the memory need to be physically contiguous?

Perhaps he is talking about support for large pages (2M) and the related
improvements in TLB performance.  If so, he (and you) may want to read about
the 'superpages' feature of FreeBSD.
I am not sure if it is enabled by default in 8.0; you can check
vm.pmap.pg_ps_enabled.
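
For reference, a minimal sketch of such a check in Python, assuming the
sysctl(8) utility is on the PATH and that a non-zero value means superpages
are enabled.  The helper names are made up for illustration; only the sysctl
name comes from the text above.

```python
import subprocess


def parse_sysctl_flag(raw):
    """Interpret the text printed by `sysctl -n <name>` as an on/off flag."""
    return int(raw.strip()) != 0


def superpages_enabled(name="vm.pmap.pg_ps_enabled"):
    """Return True/False, or None if the sysctl cannot be read.

    None covers the cases where sysctl is missing, the name does not
    exist on this system, or the value is not an integer.
    """
    try:
        result = subprocess.run(
            ["sysctl", "-n", name],
            capture_output=True, text=True, check=True,
        )
        return parse_sysctl_flag(result.stdout)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


if __name__ == "__main__":
    state = superpages_enabled()
    if state is None:
        print("could not read vm.pmap.pg_ps_enabled on this system")
    else:
        print("superpages enabled" if state else "superpages disabled")
```

From a shell the equivalent is simply `sysctl vm.pmap.pg_ps_enabled`.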

>> Moreover, process scheduling is not so nice, as *BSD employs an algorithm
>> that changes physical CPUs in turn instead of allocating one core for this
>> kind of job.
>> Take your own benchmark, and you'll see..

Here I can only add an anecdotal 'me too'.
Sometimes I run single-threaded high-cpu programs like ffmpeg transcoding on
an otherwise idle system (only a bunch of system daemons in the background).
And I see that the cpu-consuming process frequently goes back and forth
between my two cores.  CPU user loads on the cores are something like 60% vs
40%.
My expectation was that the process would mostly run on one core while the
rest of the threads would mostly run on the other.
I am not sure if that core switching really hurts performance or if there is
anything wrong with it.  But somehow it seems 'counter-intuitive'.
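
One way to find out whether the migration matters would be to run the same job
pinned to a single core and compare timings.  A minimal sketch, assuming
FreeBSD's cpuset(1) utility and its -l (CPU list) option; the helper name and
the ffmpeg file names are hypothetical:

```python
import shlex


def pinned_command(cpu, argv):
    """Prefix argv with cpuset(1) so that the job may only run on one CPU."""
    return ["cpuset", "-l", str(cpu)] + list(argv)


# Hypothetical transcoding job restricted to CPU 0.
cmd = pinned_command(0, ["ffmpeg", "-i", "in.avi", "out.mp4"])

# Print the command line instead of running it, so the sketch is harmless
# on systems without cpuset or ffmpeg.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Timing the pinned run against an unpinned one (e.g. with time(1)) should show
whether the back-and-forth between cores costs anything measurable.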

> *Result*
> Machine: Core i7 920 (42.56-44.8 GFlops) / DDR3 1066
> OS: FreeBSD 8.0/amd64 and Ubuntu 9.10
> GotoBLAS2: 1.13
> 
> dgemm result
> OS      : FLOPS            : percent of peak
> FreeBSD : 32.0 GFlops      : 71%
> Ubuntu  : 42.0-42.7 GFlops : 93.8%-95.3%
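
For readers checking the numbers: the quoted peak range and percentages are
consistent with the arithmetic below.  This is only a sketch; the clock speeds
(2.66 GHz nominal, 2.80 GHz with turbo) and the figure of 4 double-precision
flops per core per cycle are my assumptions, they are not stated in the
original message.

```python
CORES = 4              # Core i7 920 is a quad-core part
FLOPS_PER_CYCLE = 4    # assumption: 2 DP adds + 2 DP multiplies per cycle (SSE2)
BASE_GHZ = 2.66        # assumption: nominal clock
TURBO_GHZ = 2.80       # assumption: clock with turbo on all cores


def peak_gflops(ghz):
    """Theoretical double-precision peak in GFlops at the given clock."""
    return CORES * FLOPS_PER_CYCLE * ghz


def percent_of_peak(measured_gflops, peak):
    """Measured dgemm rate as a percentage of the theoretical peak."""
    return 100.0 * measured_gflops / peak


base_peak = peak_gflops(BASE_GHZ)
turbo_peak = peak_gflops(TURBO_GHZ)
print(f"peak: {base_peak:.2f}-{turbo_peak:.2f} GFlops")

# The percentages in the table match the upper (turbo) peak.
for label, gflops in [("FreeBSD", 32.0), ("Ubuntu low", 42.0), ("Ubuntu high", 42.7)]:
    print(f"{label}: {percent_of_peak(gflops, turbo_peak):.1f}% of peak")
```

Note that against the lower 42.56 GFlops figure the Ubuntu result would be at
or slightly above 100%, so the table must be relative to 44.8 GFlops.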

It would also be good to learn more about your program.
How much memory does it typically use, and how does it allocate it?
Is it single-threaded or not?  If not, how many threads does it have, what do
they do, and how do they communicate?

-- 
Andriy Gapon
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
