Jan Van Besien created MAPREDUCE-7540:
-----------------------------------------
Summary: JDK24 subject propagation case missed in
CommitterEventHandler
Key: MAPREDUCE-7540
URL: https://issues.apache.org/jira/browse/MAPREDUCE-7540
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Jan Van Besien
HADOOP-19670 converted {{CommitterEventHandler}} to use
{{SubjectInheritingThread}}, but the backing {{ThreadFactory}} that creates
these threads is only installed when the job classloader is enabled:
{code:java}
// CommitterEventHandler.serviceStart()
ThreadFactoryBuilder tfBuilder = new ThreadFactoryBuilder()
    .setNameFormat("CommitterEvent Processor #%d");
if (jobClassLoader != null) {
  // if the job classloader is enabled, we need to use the job classloader
  // as the thread context classloader (TCCL) of these threads in case the
  // committer needs to load another class via TCCL
  ThreadFactory backingTf = new ThreadFactory() {
    @Override
    public Thread newThread(Runnable r) {
      Thread thread = new SubjectInheritingThread(r);
      thread.setContextClassLoader(jobClassLoader);
      return thread;
    }
  };
  tfBuilder.setThreadFactory(backingTf);
}
ThreadFactory tf = tfBuilder.build();
{code}
When {{mapreduce.job.classloader}} is not enabled,
{{ThreadFactoryBuilder.build()}} falls back to
{{Executors.defaultThreadFactory()}}, so the "CommitterEvent Processor"
pool threads are plain {{Thread}} instances.
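The fallback can be reproduced with JDK classes alone. The following is a minimal sketch, not the Hadoop code: {{MarkerThread}} is a hypothetical stand-in for {{SubjectInheritingThread}}, and {{chooseFactory}} is a hypothetical helper that imitates the conditional installation described above, on the assumption that the builder's default is {{Executors.defaultThreadFactory()}}.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

class DefaultFactoryDemo {
  /** Hypothetical stand-in for Hadoop's SubjectInheritingThread. */
  static class MarkerThread extends Thread {
    MarkerThread(Runnable r) {
      super(r);
    }
  }

  /**
   * Imitates the current behaviour: the wrapping factory is only used
   * when a job classloader is present; otherwise the JDK default is used.
   */
  static ThreadFactory chooseFactory(ClassLoader jobClassLoader) {
    if (jobClassLoader != null) {
      return r -> {
        Thread thread = new MarkerThread(r);
        thread.setContextClassLoader(jobClassLoader);
        return thread;
      };
    }
    // Assumed fallback, as described in the report.
    return Executors.defaultThreadFactory();
  }

  public static void main(String[] args) {
    Runnable noop = () -> { };
    Thread withLoader =
        chooseFactory(ClassLoader.getSystemClassLoader()).newThread(noop);
    Thread withoutLoader = chooseFactory(null).newThread(noop);
    // Report which concrete thread class each configuration produced.
    System.out.println("with job classloader: "
        + withLoader.getClass().getSimpleName());
    System.out.println("without job classloader: "
        + withoutLoader.getClass().getSimpleName());
  }
}
```

Only the first configuration yields the wrapping subclass; the second produces a bare {{java.lang.Thread}}, which is the case this issue is about.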
On JDK 24+ (SecurityManager permanently disabled, JEP 486), plain threads no
longer inherit the Subject bound by {{Subject.callAs}}. So inside
{{CommitterEventHandler$EventProcessor}},
{{UserGroupInformation.getCurrentUser()}} finds no Subject and falls back to
the login user. As a result, {{OutputCommitter.setupJob()}} / {{commitJob()}}
perform their HDFS operations as the NodeManager's OS user (e.g. {{yarn}})
instead of the job submitter.
Observed on Hadoop 3.5.0 running on JDK 25 (see stack trace below).
Note the asymmetry in the stack trace: the NameNode side shows the patched
{{SubjectInheritingThread}} frames from HADOOP-19668, while the AM side bottoms
out in a bare {{Thread.run}} with no Subject wrapper frames.
{noformat}
org.apache.hadoop.security.AccessControlException: Permission denied:
user=yarn, access=WRITE,
inode="/tmp/lily/crunch/crunch-1246119727/p27":airflow:lily:drwxrwxr-x
at
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:663)
at
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:503)
at
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:527)
at
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:397)
at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1974)
at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1955)
at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1914)
at
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:62)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3568)
at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1176)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:742)
at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:631)
at
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:599)
at
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:583)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1228)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1312)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1227)
at
java.base/jdk.internal.vm.ScopedValueContainer.callWithoutScope(ScopedValueContainer.java:162)
at
java.base/jdk.internal.vm.ScopedValueContainer.call(ScopedValueContainer.java:147)
at java.base/java.lang.ScopedValue$Carrier.call(ScopedValue.java:419)
at java.base/javax.security.auth.Subject.callAs(Subject.java:331)
at
org.apache.hadoop.security.authentication.util.SubjectUtil.callAs(SubjectUtil.java:242)
at
org.apache.hadoop.security.authentication.util.SubjectUtil.doAs(SubjectUtil.java:314)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1958)
at org.apache.hadoop.ipc.Server$Handler.work(Server.java:3279)
at
org.apache.hadoop.util.concurrent.SubjectInheritingThread$1.run(SubjectInheritingThread.java:203)
at
org.apache.hadoop.util.concurrent.SubjectInheritingThread$1.run(SubjectInheritingThread.java:199)
at
java.base/jdk.internal.vm.ScopedValueContainer.callWithoutScope(ScopedValueContainer.java:162)
at
java.base/jdk.internal.vm.ScopedValueContainer.call(ScopedValueContainer.java:147)
at java.base/java.lang.ScopedValue$Carrier.call(ScopedValue.java:419)
at java.base/javax.security.auth.Subject.callAs(Subject.java:331)
at
org.apache.hadoop.security.authentication.util.SubjectUtil.callAs(SubjectUtil.java:242)
at
org.apache.hadoop.security.authentication.util.SubjectUtil.doAs(SubjectUtil.java:275)
at
org.apache.hadoop.util.concurrent.SubjectInheritingThread.run(SubjectInheritingThread.java:199)
at
java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
at
java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at
java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:483)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2557)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2531)
at
org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1507)
at
org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1504)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1521)
at
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1496)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2497)
at
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:361)
at
org.apache.crunch.io.CrunchOutputs$CompositeOutputCommitter.setupJob(CrunchOutputs.java:300)
at
org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:256)
at
org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:236)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614)
at java.base/java.lang.Thread.run(Thread.java:1474)
{noformat}
*Fix*: always install the backing factory; only the classloader assignment
should be conditional:
{code:java}
ThreadFactory backingTf = new ThreadFactory() {
  @Override
  public Thread newThread(Runnable r) {
    Thread thread = new SubjectInheritingThread(r);
    if (jobClassLoader != null) {
      thread.setContextClassLoader(jobClassLoader);
    }
    return thread;
  }
};
tfBuilder.setThreadFactory(backingTf);
{code}
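A regression check for this change could look roughly like the sketch below. It is JDK-only and makes several assumptions: {{InheritingThread}} is a hypothetical stand-in for {{SubjectInheritingThread}}, and {{backingFactory}} and {{workerIsInheriting}} are illustrative helpers, not existing Hadoop methods. It runs one task on a real pool and checks the class of the worker thread with and without a job classloader.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

class AlwaysWrapFactoryCheck {
  /** Hypothetical stand-in for Hadoop's SubjectInheritingThread. */
  static class InheritingThread extends Thread {
    InheritingThread(Runnable r) {
      super(r);
    }
  }

  /** Always wraps; only the classloader assignment is conditional. */
  static ThreadFactory backingFactory(ClassLoader jobClassLoader) {
    return r -> {
      Thread thread = new InheritingThread(r);
      if (jobClassLoader != null) {
        thread.setContextClassLoader(jobClassLoader);
      }
      return thread;
    };
  }

  /** Runs one task on a pool and reports whether its worker is wrapped. */
  static boolean workerIsInheriting(ClassLoader jobClassLoader) {
    ExecutorService pool =
        Executors.newFixedThreadPool(1, backingFactory(jobClassLoader));
    try {
      return pool
          .submit(() -> Thread.currentThread() instanceof InheritingThread)
          .get(10, TimeUnit.SECONDS);
    } catch (Exception e) {
      throw new IllegalStateException("pool task did not complete", e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // Both configurations should report a wrapped worker thread.
    System.out.println(workerIsInheriting(null));
    System.out.println(workerIsInheriting(ClassLoader.getSystemClassLoader()));
  }
}
```

In an actual Hadoop test the assertion would more usefully compare {{UserGroupInformation.getCurrentUser()}} inside the pool task against the submitting user, since that is the behaviour that regressed.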