Benjamin Teke created YARN-11813:
------------------------------------
Summary: YARN incorrectly falls back to cgroup v1 when cgroup v2
has v1 named subhierarchies
Key: YARN-11813
URL: https://issues.apache.org/jira/browse/YARN-11813
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Benjamin Teke
Assignee: Benjamin Teke
YARN-11743 introduced a fallback behaviour, where if a controller is not
mounted in v1, YARN tries to use it with v2. This is handled by the following
init step:
{code:java}
private static void initializeCGroupHandlers(Configuration conf,
    CGroupsHandler.CGroupController controller) throws ResourceHandlerException {
  initializeCGroupV1Handler(conf);
  if (cgroupsV2Enabled && !isMountedInCGroupsV1(controller)) {
    initializeCGroupV2Handler(conf);
  }
}
{code}
This breaks when preconfigured mount paths are used
(yarn.nodemanager.linux-container-executor.cgroups.mount-path). With a
preconfigured mount path, /etc/mtab is no longer checked, so if the cgroup v2
hierarchy contains a subhierarchy named after a v1 controller (e.g. cpu,
memory, devices), YARN assumes the controller is mounted in v1 without checking
the contents of the directory. It then tries to update the v1 controller files
on application launch, which causes application failures.
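One way to "check the contents of the folder" is to look for the cgroup.controllers file, which exists in every cgroup v2 directory and is not part of cgroup v1. The snippet below is a minimal sketch of that idea only. The class and method names (CGroupDirProbe, looksLikeCGroupV2) are hypothetical and are not taken from the YARN code base, and the actual fix may use a different check.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, not part of YARN: probes a directory under the mount path. */
public class CGroupDirProbe {

  /**
   * Returns true if the directory looks like a cgroup v2 directory.
   * Assumption: the presence of cgroup.controllers marks a v2 cgroup, so a
   * directory named "cpu" that contains it is a v2 subhierarchy rather than
   * a v1 controller mount.
   */
  public static boolean looksLikeCGroupV2(Path dir) {
    return Files.isRegularFile(dir.resolve("cgroup.controllers"));
  }
}
```

With such a probe, a directory like <mount-path>/cpu would only be treated as a v1 controller if looksLikeCGroupV2 returned false for it.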
The cause is the !isMountedInCGroupsV1(controller) check combined with the fact
that the v1 handler is initialized first and v2 is essentially used as a
fallback. To fix this, the order should be reversed so that v1 becomes the
fallback handler.
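The reversed order can be illustrated with a simplified, self-contained model. This is a sketch under stated assumptions, not the actual patch: HandlerOrderSketch, Version and select are hypothetical names, and the two sets stand in for whatever YARN uses to determine which controllers each cgroup version provides.

```java
import java.util.Set;

/** Hypothetical, simplified model of handler selection; not the actual YARN code. */
public class HandlerOrderSketch {

  public enum Version { V1, V2 }

  /**
   * Chooses the cgroup version for a controller, preferring v2.
   * v2Controllers and v1Mounted are assumed inputs describing which
   * controllers each version can serve.
   */
  public static Version select(String controller, boolean cgroupsV2Enabled,
      Set<String> v2Controllers, Set<String> v1Mounted) {
    // Prefer v2 whenever it is enabled and actually provides the controller.
    if (cgroupsV2Enabled && v2Controllers.contains(controller)) {
      return Version.V2;
    }
    // Fall back to v1 only if v2 could not serve the controller.
    if (v1Mounted.contains(controller)) {
      return Version.V1;
    }
    throw new IllegalStateException("Controller not available: " + controller);
  }
}
```

In this model, a v2 subhierarchy that happens to be named cpu no longer matters: because v2 is consulted first, a misleading v1 "mount" is only considered when v2 does not offer the controller.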