On Tue, May 6, 2014 at 1:35 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
> Heh. That is pretty disgusting. But I guess it could be interesting
> for timing. BRB.

Ooh. That's friggin impressive. Guys, see if you can recreate these
numbers.

This is my totally disgusting test-case, which really is just
stress-testing page faults and nothing else. Silly C file attached,
see the comment at the top of it. Then just do "time ./a.out".

It's designed to map the zero-page and access it. The "start" thing
was to make sure it's not hugepage-aligned, but that's not actually
enough with a big 1GB area, so you do need that whole "echo never"
thing since there will be tons of aligned areas that the kernel will
make noops for this case otherwise.

Anyway, on my Haswell with normal "iret", that program takes 8.4+-0.1
seconds. With the disgusting sysret hackery, it takes 6.5+-0.1
seconds. That's a rather impressive 23% performance improvement for
page faulting.

I'll do profiles and test the kernel compile too, but the raw timings
are certainly promising. The "sysret" hack is pretty disgusting, and
it's broken too. sysret doesn't do some things iret does (like TF flag
etc), so it's not complete, but it's clearly good enough to run tests
on. It will definitely break ptrace() and friends.

             Linus
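
Since the timings only mean anything with THP off, it can help to check the setting before running the test. The following is a minimal sketch, not part of the original mail. It assumes the sysfs file named in the comment at the top of the attached test-case reports the active mode in square brackets (for example "always madvise [never]"); the helper names thp_selected_mode() and thp_is_disabled() are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the bracketed token from a line such as "always madvise [never]"
 * into 'out'. Returns 0 on success, -1 if there is no bracketed token
 * or 'out' is too small. (Hypothetical helper, not from the original mail.)
 */
int thp_selected_mode(const char *line, char *out, size_t outlen)
{
	const char *open = strchr(line, '[');
	const char *close = open ? strchr(open, ']') : NULL;
	size_t len;

	if (!open || !close)
		return -1;
	len = (size_t)(close - open - 1);
	if (len + 1 > outlen)
		return -1;
	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}

/*
 * Returns 1 if THP is set to "never", 0 if it is set to something else,
 * and -1 if the setting could not be read or parsed.
 */
int thp_is_disabled(void)
{
	char line[128], mode[32];
	FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");

	if (!f)
		return -1;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	if (thp_selected_mode(line, mode, sizeof(mode)) < 0)
		return -1;
	return strcmp(mode, "never") == 0;
}
```

Calling thp_is_disabled() at the start of main() and bailing out unless it returns 1 would avoid accidentally timing a run where the kernel maps huge pages.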

//
// Make sure to do
//
//    echo never >/sys/kernel/mm/transparent_hugepage/enabled
//
// to disable THP for this stupid test-case.
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

#define SIZE (1024*1024*1024)

int main(int argc, char **argv)
{
	void *addr, *start;
	int i;

	start = 8192 + mmap(NULL, 4096, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
	start = (void *)(8192 | (unsigned long) start);

	for (i = 0; i < 100; i++) {
		unsigned int j;

		addr = mmap(start, SIZE, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
		for (j = 0; j < SIZE; j += 4096) {
			*(volatile int *)(j+addr);
		}
		munmap(addr, SIZE);
	}
	return 0;
}
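
For a rough sense of scale, the wall-clock numbers in the mail can be turned into a per-fault cost. This is a back-of-the-envelope sketch, not something from the original mail. It assumes 4096-byte pages, THP disabled, and that every access in the inner loop takes exactly one fault. The result is an upper bound, since the total time also includes the mmap()/munmap() calls and the loop itself. The function names are made up for illustration.

```c
#include <stdio.h>

#define SIZE       (1024UL * 1024 * 1024)	/* mapping size from the test-case */
#define PAGE       4096UL			/* stride of the inner loop */
#define ITERATIONS 100UL			/* outer loop count */

/* Total number of page touches (and, by assumption, page faults). */
unsigned long total_faults(void)
{
	return ITERATIONS * (SIZE / PAGE);
}

/* Wall-clock seconds for the whole run, converted to ns per fault. */
double ns_per_fault(double seconds)
{
	return seconds * 1e9 / (double)total_faults();
}

/* Relative reduction in run time, in percent. */
double improvement_pct(double before, double after)
{
	return 100.0 * (before - after) / before;
}

#ifdef FAULT_MATH_MAIN	/* build with -DFAULT_MATH_MAIN for a standalone program */
int main(void)
{
	printf("faults:      %lu\n", total_faults());
	printf("iret:        %.0f ns/fault\n", ns_per_fault(8.4));
	printf("sysret:      %.0f ns/fault\n", ns_per_fault(6.5));
	printf("improvement: %.1f%%\n", improvement_pct(8.4, 6.5));
	return 0;
}
#endif
```

Under those assumptions the run takes about 26 million faults, which puts the measured cost at roughly 320 ns per fault with iret and roughly 248 ns with the sysret hack, matching the quoted 23% (more precisely about 22.6%).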