Hi,
I have improved my amd64 pmap patches a bit and finally got around to
doing some benchmarks with them.
Systems:
- core-i7 is an Intel Core i7-860 with 4 cores / 8 threads
- core2duo is an Intel Core 2 Duo T8100 with 2 cores
- KVM is a VM with 4 vcpus running on qemu 2.1/Linux 3.16 on the
  above core-i7
SP/MP means single-processor/multiprocessor kernel.
Benchmarks:
- kernel: compilation of the kernel on tmpfs
- makeindex: 'make index' in /usr/ports on tmpfs
- forktest: the fork/exit micro benchmark I posted at
  http://permalink.gmane.org/gmane.os.openbsd.tech/38172
Speed up is wall clock time without patch divided by time with patch.
system     kernel  benchmark    speed up
core-i7    MP      kernel -j8   1.02
core-i7    MP      makeindex    1.11
core-i7    MP      forktest     1.52
core-i7    SP      kernel -j1   1.00
core-i7    SP      forktest     1.03
core2duo   MP      kernel -j2   1.17
core2duo   MP      makeindex    1.12
core2duo   MP      forktest     1.11
KVM        MP      kernel -j4   1.20
KVM        MP      forktest     3.26
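
To make the speed up column concrete, the ratio can be written as a tiny
helper. The function name speedup is made up for illustration, and the
timings in the note below are invented to show the arithmetic, not
measurements from these runs.

```c
/*
 * Speed up as defined in this mail: wall clock time without the
 * patch divided by wall clock time with the patch. Values above
 * 1.0 mean the patched kernel is faster.
 * Both arguments are in the same unit (e.g. seconds); t_with must
 * be non-zero.
 */
double
speedup(double t_without, double t_with)
{
	return t_without / t_with;
}
```

For example, a hypothetical run taking 152 seconds without the patch and
100 seconds with it would give speedup(152.0, 100.0), i.e. about 1.52.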
As you can see, there is no test case where the new pmap is slower than
the old one. On bare metal, there is some speed up, depending on system
and workload. Under virtualization, the speed difference is very
significant. I didn't run the makeindex test on KVM because it is so slow
with the old pmap.
I would like people to test the diffs on other machines, in particular on
non-Intel CPUs. The only AMD system I could get hold of did not run
reliably with OpenBSD even without the pmap diff.
I am posting the patch in two parts as separate mails. The first part
contains some simple cleanups and the second part does the actual
optimizations. Compared to the patch I posted on August 18th, the
current patch uses the direct mapping in some cases to avoid more TLB
flushes (some bits of this come from a different old patch by art@ from
2005). Also, many comments have been updated.
Apart from testing, I would of course also appreciate reviews. Compared to
art's patch from 2005, which tried to use the direct mapping for
everything, the current patch is much less intrusive. Therefore I hope
that it is easier to review/debug than his patch, which unfortunately
never got integrated.
Cheers,
Stefan