On 08/18/2017 07:46 AM, Mel Gorman wrote:
> On Fri, Aug 18, 2017 at 02:20:38PM +0000, Liang, Kan wrote:
>>> Nothing fancy other than needing a comment if it works.
>>>
>>
>> No, the patch doesn't work.
>>
>
> That indicates that it may be a hot page and it's possible that the page is
> locked for a short time but waiters accumulate. What happens if you leave
> NUMA balancing enabled but disable THP? Waiting on migration entries also
> uses wait_on_page_locked so it would be interesting to know if the problem
> is specific to THP.
>
> Can you tell me what this workload is doing? I want to see if it's something
> like many threads pounding on a limited number of pages very quickly. If

It is a customer workload so we have limited visibility. But we believe
there are some pages that are frequently accessed by all threads.

> it's many threads working on private data, it would also be important to
> know how each buffers threads are aligned, particularly if the buffers
> are smaller than a THP or base page size. For example, if each thread is
> operating on a base page sized buffer then disabling THP would side-step
> the problem but THP would be false sharing between multiple threads.
>

Still, I don't think this problem is THP specific. If there is a hot
regular page getting migrated, we'll also see many threads get queued
up quickly. THP may have made the problem worse as migrating it takes
a longer time, meaning more threads could get queued up.

Thanks.

Tim

