Hi,

I have improved my amd64 pmap patches a bit and finally got around to 
doing some benchmarks with them.

Systems:
- core-i7 is an Intel Core i7-860 with 4 cores / 8 threads
- core2duo is an Intel Core 2 Duo T8100 with 2 cores
- KVM is a VM with 4 vcpus running on qemu 2.1/Linux 3.16 on the
  above core-i7

SP/MP means single/multi processor kernel

Benchmarks:
kernel: compilation of the kernel on tmpfs
makeindex: 'make index' in /usr/ports on tmpfs
forktest: the fork/exit micro benchmark I posted at
    http://permalink.gmane.org/gmane.os.openbsd.tech/38172

Speed up is wall clock time without patch divided by time with patch.

system   kernel benchmark       speed up
core-i7  MP     kernel -j8      1.02
core-i7  MP     makeindex       1.11
core-i7  MP     forktest        1.52
core-i7  SP     kernel -j1      1.00
core-i7  SP     forktest        1.03
core2duo MP     kernel -j2      1.17
core2duo MP     makeindex       1.12
core2duo MP     forktest        1.11
KVM      MP     kernel -j4      1.20
KVM      MP     forktest        3.26

As you can see, there is no test case where the new pmap is slower than
the old one. On bare metal, there is some speed up, depending on system
and workload. Under virtualization, the speed difference is very
significant. I didn't run the makeindex test on KVM because it is so slow
with the old pmap.

I would like people to test the diffs on other machines. In particular on 
non-Intel CPUs. The only AMD system I could get hold of did not run 
reliably with OpenBSD even without the pmap diff.


I am posting the patch in two parts as separate mails. The first part 
contains some simple cleanups and the second part does the actual 
optimizations. Compared to the patch I posted on August 18th, the 
current patch uses the direct mapping in some cases to avoid more TLB 
flushes (some bits of this come from a different old patch by art@ from 
2005). Also, many comments have been updated.

Apart from testing, I would of course also appreciate reviews. Compared to 
art's patch from 2005 which tried to use the direct mapping for 
everything, the current patch is much less intrusive. Therefore I hope 
that it is easier to review/debug than his patch, which unfortunately 
never got integrated.

Cheers,
Stefan
