On Mon 22-05-17 02:03:21, Guenter Roeck wrote:
> On 05/22/2017 01:45 AM, Michal Hocko wrote:
> >On Sat 20-05-17 09:26:34, Michal Hocko wrote:
> >>On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
> >>>Hi,
> >>>
> >>>my qemu tests of next-20170519 show the following results:
> >>>	total: 122 pass: 30 fail: 92
> >>>
> >>>I won't bother listing all of the failures; they are available at
> >>>http://kerneltests.org/builders. I bisected one (openrisc, because
> >>>it gives me some console output before dying). It points to
> >>>'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
> >>>
> >>>A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
> >>>32-bit x86 images fail and should provide an easy test case.
> >>
> >>Hmm, this is quite unexpected as the patch is not supposed to change
> >>things much. It just removes the flag and performs the new hash scaling
> >>automatically for all requests which do not have any high limit.
> >>Some of those didn't have HASH_ADAPT before but that shouldn't change
> >>the picture much. The only thing that I can imagine is that what
> >>formerly failed for early memblock allocations is now succeeding and that
> >>depletes the early memory. Do you have any serial console from the boot?
> >
> >OK, I guess I know what is going on here. Adaptive hash scaling is not
> >really suited for 32b. ADAPT_SCALE_BASE is just too large for the word
> >size and so we end up in the endless loop. So the issue has been
> >introduced by the original "mm: adaptive hash table scaling" but my
> >patch made it more visible because [di]cache hash tables most probably
> >succeeded in the early initialization which didn't have HASH_ADAPT.
> >The following should fix the hang. I am not yet sure about the maximum
> >size for the scaling and something even smaller would make sense to me
> >because kernel address space is just too small for such large hash
> >tables.
> >---
> >diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >index a26e19c3e1ff..70c5fc1fb89a 100644
> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -7174,11 +7174,15 @@ static unsigned long __init arch_reserved_kernel_pages(void)
> > /*
> >  * Adaptive scale is meant to reduce sizes of hash tables on large memory
> >  * machines. As memory size is increased the scale is also increased but at
> >- * slower pace. Starting from ADAPT_SCALE_BASE (64G), every time memory
> >- * quadruples the scale is increased by one, which means the size of hash table
> >- * only doubles, instead of quadrupling as well.
> >+ * slower pace. Starting from ADAPT_SCALE_BASE (64G on 64b systems and 32M
> >+ * on 32b), every time memory quadruples the scale is increased by one, which
> >+ * means the size of hash table only doubles, instead of quadrupling as well.
> >  */
> >+#if __BITS_PER_LONG == 64
> > #define ADAPT_SCALE_BASE (64ul << 30)
> >+#else
> >+#define ADAPT_SCALE_BASE (32ul << 20)
> >+#endif
> > #define ADAPT_SCALE_SHIFT 2
> > #define ADAPT_SCALE_NPAGES (ADAPT_SCALE_BASE >> PAGE_SHIFT)
> >
> I have seen another patch making it 64ull. Not sure if adaptive scaling
> on 32 bit systems really makes sense; unless there is a clear need I'd rather
> leave it alone.

I've just found out that my incoming email sync hasn't worked since
Friday, so I've missed those follow-up emails. I will double check.
-- 
Michal Hocko
SUSE Labs
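
For reference, the 32b hang described in the quoted analysis can be
modelled in userspace. The sketch below is only an illustration, not the
kernel code: it assumes 4 KiB pages, emulates a 32-bit unsigned long with
uint32_t, and assumes the scaling loop starts at ADAPT_SCALE_NPAGES and
shifts left by ADAPT_SCALE_SHIFT until it reaches the number of entries.
The helper names adapt_base_32() and count_scale_steps_32() are
hypothetical, and the iteration cap exists only so the model can report
non-termination instead of spinning.

```c
#include <stdint.h>

#define PAGE_SHIFT        12   /* assumption: 4 KiB pages */
#define ADAPT_SCALE_SHIFT 2    /* value taken from the quoted patch */

/*
 * (64ul << 30) evaluated with a 32-bit word: 64 << 30 is 2^36, which
 * does not fit in 32 bits and wraps to 0 under unsigned arithmetic.
 */
static uint32_t adapt_base_32(void)
{
	uint32_t base = 64;

	return (uint32_t)(base << 30);
}

/*
 * Model of the scaling loop with a 32-bit word (hypothetical helper).
 * Returns how many times the scale would be incremented, or -1 when
 * max_iter is reached, i.e. when the uncapped loop would never finish.
 */
static int count_scale_steps_32(uint32_t base_bytes, uint32_t numentries,
				int max_iter)
{
	uint32_t adapt = base_bytes >> PAGE_SHIFT;
	int steps = 0;

	while (adapt < numentries) {
		if (steps == max_iter)
			return -1;
		/* Shifting 0 left keeps it 0, so a zero base makes no progress. */
		adapt <<= ADAPT_SCALE_SHIFT;
		steps++;
	}
	return steps;
}
```

With a zero base the model never makes progress, which matches the
endless loop described above. With the 32M base from the patch
(32u << 20) and 262144 entries, which is the page count of 1 GiB at
4 KiB pages, it stops after three steps.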