I experimented with Transparent Hugepages (THP) some time ago and want to 
share some numbers from a real-world, high-load application.
We have a JVM application: a high-load TCP server based on Netty. There is no 
clear bottleneck; CPU, memory and network are all about equally loaded. The 
amount of work depends on request content.
The following numbers were taken under normal server load, about 40% of the 
maximum number of requests one server can handle.

*When THP is off:*
End-to-end application latency in microseconds:
"p50" : 718.891,
"p95" : 4110.26,
"p99" : 7503.938,
"p999" : 15564.827,

perf stat -e dTLB-load-misses,iTLB-load-misses -p PID -I 1000
...
...         25,164,369      iTLB-load-misses
...         81,154,170      dTLB-load-misses
...

*When THP is always on:*
End-to-end application latency in microseconds:
"p50" : 601.196,
"p95" : 3260.494,
"p99" : 7104.526,
"p999" : 11872.642,

perf stat -e dTLB-load-misses,iTLB-load-misses -p PID -I 1000
...
...    21,400,513      dTLB-load-misses
...      4,633,644      iTLB-load-misses
...

As you can see, the THP performance impact is measurable and too significant 
to ignore: 4.1 ms vs 3.3 ms at p95, and about 106M vs 26M TLB misses (dTLB 
plus iTLB).
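
For a quick sanity check of that summary, the relative changes can be recomputed from the numbers posted above. This is only a sketch: the values are copied from this message, the TLB counts are the single perf sample shown for each run, and the helper name reduction_pct is made up for illustration.

```python
# Latency percentiles copied from the measurements above (microseconds).
thp_off = {"p50": 718.891, "p95": 4110.26, "p99": 7503.938, "p999": 15564.827}
thp_on = {"p50": 601.196, "p95": 3260.494, "p99": 7104.526, "p999": 11872.642}

# TLB misses from the one perf stat sample shown for each configuration.
tlb_off = {"dTLB": 81_154_170, "iTLB": 25_164_369}
tlb_on = {"dTLB": 21_400_513, "iTLB": 4_633_644}


def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent (hypothetical helper)."""
    return (before - after) / before * 100.0


latency_gain = {k: reduction_pct(thp_off[k], thp_on[k]) for k in thp_off}
tlb_gain = {k: reduction_pct(tlb_off[k], tlb_on[k]) for k in tlb_off}
total_tlb_gain = reduction_pct(sum(tlb_off.values()), sum(tlb_on.values()))

for name, pct in latency_gain.items():
    print(f"{name}: latency {pct:.1f}% lower with THP")
for name, pct in tlb_gain.items():
    print(f"{name}-load-misses: {pct:.1f}% fewer with THP")
print(f"all TLB load misses: {total_tlb_gain:.1f}% fewer with THP")
```

On these figures the p95 improvement comes out at roughly 21% and the combined TLB miss reduction at roughly 76%. The p99 gain is much smaller (about 5%), so the benefit is not uniform across the distribution.
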
I also used SystemTap to measure a few kernel functions such as 
collapse_huge_page, clear_huge_page and split_huge_page. There were no 
significant spikes with THP enabled.
AFAIR that was a 3.10 kernel, which is 4 years old now. I can repeat the 
experiments with newer kernels if there's interest (I don't know what has 
changed there, though).
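
For anyone repeating the comparison, it helps to record which THP mode a host is actually in. The sketch below assumes that the "off" and "always on" runs above correspond to the never and always modes exposed in /sys/kernel/mm/transparent_hugepage/enabled on mainline Linux; that mapping is an assumption, since the message does not say how THP was toggled. The function names are made up for illustration.

```python
from pathlib import Path

# Standard mainline sysfs location; some distribution kernels may differ.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from a line like 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode, or None if the file is missing or unreadable."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


print("THP mode:", current_thp_mode())
```

The kernel marks the active mode with square brackets, so the parser only has to find the bracketed token. On a system without that file, such as a non-Linux host, the function returns None rather than raising.
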


On Monday, August 7, 2017 at 6:42:21 PM UTC+3, Peter Veentjer wrote:
>
> Hi Everyone,
>
> I'm failing to understand the problem with transparent huge pages.
>
> I 'understand' how normal pages work. A page is typically 4kb in a virtual 
> address space; each process has its own. 
>
> I understand how the TLB fits in; a cache providing a mapping of virtual 
> to real addresses to speed up address conversion.
>
> I understand that using a large page, e.g. 2 MB instead of a 4 KB page, can 
> reduce pressure on the TLB.
>
> So far it looks like large pages make a lot of sense; of course at the 
> expense of wasting memory if only a small section of a page is being 
> used. 
>
> The first part I don't understand is: why is it called transparent huge 
> pages? So what is transparent about it? 
>
> The second part I'm failing to understand is: why can it cause problems? 
> There are quite a few applications that recommend disabling THP, and I 
> recently helped a customer who benefited from disabling it. It seems there 
> is more going on behind the scenes than an increased page size. Is it 
> caused by fragmentation? That is, if a new page is needed and memory is 
> fragmented (due to smaller pages), do the small pages need to be compacted 
> before a new huge page can be allocated? But if this were the only issue, 
> it shouldn't be a problem once all pages for the application have been 
> touched and all pages are retained.
>
> So I'm probably missing something simple.
>
>
