The answer is "no", I don't have root access, but I suspect that would be the
right fix if it is currently set to [always], in which case either madvise or
never would be a good option. If it is of interest, I'll ask someone to try it
and report back on what happens.
-Original Message-
From:
Only the one in brackets is set, others are unset alternatives.
If you write "madvise" in that file, it'll become "always [madvise] never".
Brice
On 29/01/2019 at 15:36, Biddiscombe, John A. wrote:
> On the 8 numa node machine
>
> $ cat /sys/kernel/mm/transparent_hugepage/enabled
> [always]
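
The bracket convention described above can also be checked from code. The following is only a minimal sketch: the helper names thp_active_mode and thp_read_active_mode are made up for illustration and are not part of hwloc or any library. It assumes the sysfs file holds a single line such as "always [madvise] never".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the bracketed (active) token from a line such
 * as "always [madvise] never" into out. Returns 0 on success, -1 if no
 * bracket pair is found or out is too small. */
int thp_active_mode(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);
    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;
    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return -1;
    size_t len = (size_t)(close - open - 1);
    if (len + 1 > outlen)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: read the current setting from sysfs. Returns -1 if
 * the file is missing (e.g. a kernel built without THP) or unreadable. */
int thp_read_active_mode(char *out, size_t outlen)
{
    char line[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (ok == NULL)
        return -1;
    return thp_active_mode(line, out, outlen);
}
```

Reading the file needs no special privileges; only writing a new value to it requires root.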
On the 8 numa node machine
$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
is set already, so I'm not really sure what should go in there to disable it.
JB
-Original Message-
From: Brice Goglin
Sent: 29 January 2019 15:29
To: Biddiscombe, John A. ; Hardware
Oh, that's very good to know. I guess lots of people using first touch
will be affected by this issue. We may want to add a hwloc memory flag
doing something similar.
Do you have root access to verify that writing "never" or "madvise" in
/sys/kernel/mm/transparent_hugepage/enabled fixes the issue?
Brice
madvise(addr, n * sizeof(T), MADV_NOHUGEPAGE)
seems to make things behave much more sensibly. I had no idea it was a thing,
but one of my colleagues pointed me to it.
Problem seems to be solved for now. Thank you very much for your insights and
suggestions/help.
JB
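
For readers who want to try the madvise() call mentioned above, here is a minimal sketch of a per-allocation opt-out from transparent huge pages. It is not the code used in this thread: the names alloc_no_thp and free_no_thp are hypothetical, and the buffer is obtained with mmap() so that the address is page-aligned, which madvise() requires (it fails with EINVAL otherwise).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map len bytes of anonymous memory and ask the kernel
 * not to back the range with transparent huge pages. Returns NULL if the
 * mapping fails. If advised is non-NULL, *advised is set to 1 when madvise()
 * accepted the hint and to 0 otherwise (e.g. a kernel built without THP). */
void *alloc_no_thp(size_t len, int *advised)
{
    assert(len > 0);
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;
    /* Give the hint before first touch, so no huge page is ever installed. */
    int rc = madvise(addr, len, MADV_NOHUGEPAGE);
    if (advised != NULL)
        *advised = (rc == 0);
    return addr;
}

/* Hypothetical helper: release a buffer obtained from alloc_no_thp(). */
void free_no_thp(void *addr, size_t len)
{
    if (addr != NULL)
        munmap(addr, len);
}
```

Unlike writing to the sysfs file, this needs no root access and only affects the advised range, which is why it suits first-touch placement code.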
-Original
I wondered something similar. The crazy patterns usually happen on columns of
the 2D matrix and as it is column major, it does loosely fit the idea (most of
the time).
I will play some more (though I'm fed up with it now).
JB
-Original Message-
From: Brice Goglin
Sent: 29 January
Crazy idea: 512 pages could be replaced with a single 2MB huge page.
You're not requesting huge pages in your allocation but some systems
have transparent huge pages enabled by default (e.g. RHEL
https://access.redhat.com/solutions/46111)
This could explain why 512 pages get allocated on the same
I simplified things and instead of writing to a 2D array, I allocate a 1D array
of bytes and touch pages in a linear fashion.
Then I call syscall(__NR_move_pages, ...) and retrieve a status array for each
page in the data.
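
A query of this kind might look like the sketch below. It is an illustration under assumptions, not the test program from this thread: query_page_nodes is a hypothetical name, and the code relies on the documented move_pages() behaviour that passing a NULL nodes array reports each page's current node in status without migrating anything.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: fill status[i] with the NUMA node of the i-th page
 * of buf, or with a negative errno value (e.g. -ENOENT for a page that has
 * not been touched yet). Returns 0 on success and -1 on failure with errno
 * set, for instance when the kernel lacks NUMA support or a sandbox blocks
 * the syscall. */
int query_page_nodes(void *buf, size_t npages, size_t page_size, int *status)
{
    assert(buf != NULL && status != NULL && npages > 0);
    void **pages = malloc(npages * sizeof *pages);
    if (pages == NULL)
        return -1;
    for (size_t i = 0; i < npages; i++)
        pages[i] = (char *)buf + i * page_size;
    /* pid 0 means the calling process; nodes == NULL means query only. */
    long rc = syscall(__NR_move_pages, 0, (unsigned long)npages,
                      pages, NULL, status, 0);
    int saved = errno;
    free(pages);
    errno = saved;
    return rc == 0 ? 0 : -1;
}
```

Touching the pages from threads bound to different nodes and then printing status shows where each page actually landed.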
When I allocate 511 pages and touch alternate pages on alternate numa
Thanks Gilles for this workaround. And thanks to the Open MPI developers
for their responsiveness in quickly correcting the problem too.
I'll build and deploy this new version for the users as soon as I'm back
to the laboratory.
Patrick
On 29/01/2019 at 06:48, Gilles Gouaillardet wrote:
> Patrick,
>