Here are the results from one of the benchmarks that performs
particularly poorly when THP is enabled.  Unfortunately, the vclear
patches don't seem to provide a performance boost.  I've attached
the patches, which include the changes I had to make to get the vclear
patches to apply to the latest kernel.

This first set of tests was run on the latest community kernel, with the
vclear patches:

Kernel string: Kernel 3.11.0-rc5-medusa-00021-g1a15a96-dirty
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# 
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# 
time ./run.sh
...
Done. Terminating the simulation.

real    25m34.052s
user    10769m7.948s
sys     37m46.524s

harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# 
echo never > /sys/kernel/mm/transparent_hugepage/enabled
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# 
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# 
time ./run.sh
...
Done. Terminating the simulation.

real    5m0.377s
user    2202m0.684s
sys     108m31.816s
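
For anyone scripting these runs, the active THP mode is the bracketed
word in the sysfs file shown above.  A minimal sketch of reading it
follows; the helper names active_thp_mode and read_thp_mode are made up
for illustration and are not part of the benchmark or the patches.

```python
import re
from pathlib import Path

# sysfs file used in the session above.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (currently active) mode, e.g. 'always'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no bracketed mode in {text!r}")
    return match.group(1)


def read_thp_mode(path=THP_ENABLED):
    """Read the mode from sysfs; return None if it cannot be determined."""
    try:
        return active_thp_mode(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # The two strings seen in the sessions above.
    print(active_thp_mode("[always] madvise never"))
    print(active_thp_mode("always madvise [never]"))
    # Result depends on the machine; None where THP sysfs is absent.
    print("this machine:", read_thp_mode())
```

Logging the mode next to each timing this way would make it harder to
mix up which run used which setting.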

Here are the same tests on the clean kernel:

Kernel string: Kernel 3.11.0-rc5-medusa-00013-g584d88b
athorlton@harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l>
 cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
athorlton@harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l>
 time ./run.sh
...
Done. Terminating the simulation.

real    21m44.052s
user    10809m55.356s
sys     39m58.300s


harp31-sys:~ # echo never > /sys/kernel/mm/transparent_hugepage/enabled
athorlton@harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l>
 cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
athorlton@harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l>
 time ./run.sh
...
Done. Terminating the simulation.

real    4m52.502s
user    2127m18.548s
sys     104m50.828s
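
To put the four runs side by side, here is a small sketch that converts
the "real" times above to seconds and computes the always/never ratio
for each kernel.  The labels "vclear" and "clean" and the helper names
are my own shorthand, not anything produced by the benchmark.

```python
import re


def to_seconds(stamp):
    """Convert a time(1)-style value such as '25m34.052s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock ("real") times copied from the runs above.
REAL = {
    "vclear": {"always": "25m34.052s", "never": "5m0.377s"},
    "clean": {"always": "21m44.052s", "never": "4m52.502s"},
}


def thp_slowdown(kernel):
    """Ratio of wall-clock time with THP 'always' to THP 'never'."""
    times = REAL[kernel]
    return to_seconds(times["always"]) / to_seconds(times["never"])


if __name__ == "__main__":
    for kernel in REAL:
        print(f"{kernel}: always/never = {thp_slowdown(kernel):.2f}x")
```

With these numbers the ratio comes out to roughly 5.1x on the patched
kernel and roughly 4.5x on the clean one, which is in line with the
vclear patches not helping on this workload.  These are single runs, so
small differences should not be over-interpreted.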

I'm now working on gathering more information about the root cause of
the performance issues.

Alex Thorlton (8):
  THP: Use real address for NUMA policy
  mm: make clear_huge_page tolerate non aligned address
  THP: Pass real, not rounded, address to clear_huge_page
  x86: Add clear_page_nocache
  mm: make clear_huge_page cache clear only around the fault address
  x86: switch the 64bit uncached page clear to SSE/AVX v2
  remove KM_USER0 from kmap_atomic call
  fix up references to kernel_fpu_begin/end

 arch/x86/include/asm/page.h          |  2 +
 arch/x86/include/asm/string_32.h     |  5 ++
 arch/x86/include/asm/string_64.h     |  5 ++
 arch/x86/lib/Makefile                |  1 +
 arch/x86/lib/clear_page_nocache_32.S | 30 ++++++++++++
 arch/x86/lib/clear_page_nocache_64.S | 92 ++++++++++++++++++++++++++++++++++++
 arch/x86/mm/fault.c                  |  7 +++
 mm/huge_memory.c                     | 17 +++----
 mm/memory.c                          | 31 ++++++++++--
 9 files changed, 179 insertions(+), 11 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_nocache_32.S
 create mode 100644 arch/x86/lib/clear_page_nocache_64.S

-- 
1.7.12.4
