Hi community,

I was testing Flink 1.17 on Kubernetes and ran into a strange class loading
problem. In short, the logs show that
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback was loaded,
but the program throws ClassNotFoundException anyway.

The exception is thrown by the Aliyun OSS filesystem plugin. The log shows:

2023-04-17 11:29:54.269 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting KubernetesApplicationClusterEntrypoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the ha services from the instantiated HighAvailabilityServicesFactory>
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:299)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:285)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:145)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:439)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:382)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:282)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:232)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
        at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:229)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:729)
        at org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.main(KubernetesApplicationClusterEntrypoint.java:86)
Caused by: java.io.IOException: Could not create FileSystem for highly available storage path (oss://octopus-flink-test/checkpoints/ha/state-machine-test)
        at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:102)
        at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:86)
        at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:41)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:296)
        ... 13 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback not found
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.Groups.<init>(Groups.java:107)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.Groups.<init>(Groups.java:102)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:451)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:338)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:300)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:575)
        at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.initialize(AliyunOSSFileSystem.java:341)
        at org.apache.flink.fs.osshadoop.OSSFileSystemFactory.create(OSSFileSystemFactory.java:103)
        at org.apache.flink.core.fs.PluginFileSystemFactory.create(PluginFileSystemFactory.java:62)
        at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:508)
        at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:409)
        at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274)
        at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:99)
        ... 16 more

So I turned on -verbose:class to check whether the class was loaded, and I
can see that a class with a similar name was loaded:

[Loaded org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback from file:/opt/flink/plugins/flink-oss-fs-hadoop/flink-oss-fs-hadoop-1.17.0.jar]
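
Note that -verbose:class only reports that some class loader defined a class;
it does not say whether the loader performing the failing lookup can see it.
A minimal sketch of a helper that answers that question for one specific
loader is below. The class name ClassOrigin and the method describe are
hypothetical names chosen for illustration; they are not part of Flink or
Hadoop.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Flink or Hadoop.
class ClassOrigin {

    // Tries to resolve className through the given loader and reports
    // which loader defined it and which jar or directory it came from.
    static String describe(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            String location = (src == null || src.getLocation() == null)
                    ? "<bootstrap or unknown>"
                    : src.getLocation().toString();
            return className + " loaded by " + c.getClassLoader() + " from " + location;
        } catch (ClassNotFoundException e) {
            return className + " NOT visible to " + loader;
        }
    }

    public static void main(String[] args) {
        // Uses the context class loader of the calling thread; inside a plugin
        // one would pass the plugin's own class loader instead.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        for (String name : args) {
            System.out.println(describe(name, loader));
        }
    }
}
```

Called with both the shaded and the unshaded class name, this would show
whether each name is resolvable from the loader in question, which
-verbose:class alone cannot tell.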

At first glance, I thought this was because the package name had been changed
by shading. So I downloaded a Hadoop 3 hadoop-common jar and added it to
/opt/flink/lib. After that, I can see that
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback is loaded
too:

[Loaded org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback from file:/opt/flink/lib/flink-shaded-hadoop2-uber-2.8.3-1.8.3.jar]

But the problem persists.
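
One possible explanation, stated here as an assumption rather than a confirmed
diagnosis: Flink loads each plugin through its own class loader, and that
loader delegates to the parent (where /opt/flink/lib lives) only for an
allow-list of package prefixes. If org.apache.hadoop is not on that list, a
class loaded from /opt/flink/lib stays invisible to the plugin. The sketch
below reproduces that effect with a hypothetical FilteringClassLoader; it is a
simplified stand-in, not Flink's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an isolated plugin class loader.
class FilteringClassLoader extends ClassLoader {
    private final List<String> parentFirstPrefixes;

    FilteringClassLoader(ClassLoader parent, String... parentFirstPrefixes) {
        super(parent);
        this.parentFirstPrefixes = Arrays.asList(parentFirstPrefixes);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        for (String prefix : parentFirstPrefixes) {
            if (name.startsWith(prefix)) {
                // Allowed prefix: use the normal parent delegation.
                return super.loadClass(name, resolve);
            }
        }
        // Anything else would have to come from the plugin's own jars.
        // This demo loader has none, so the lookup fails.
        throw new ClassNotFoundException("Class " + name + " not found");
    }
}

class IsolatedLoaderDemo {
    // A class the parent loader can certainly load.
    static class VisibleToParentOnly {}

    static boolean canLoad(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader parent = IsolatedLoaderDemo.class.getClassLoader();
        ClassLoader isolated = new FilteringClassLoader(parent, "java.");
        String name = VisibleToParentOnly.class.getName();

        System.out.println("parent can load it:   " + canLoad(parent, name));
        System.out.println("isolated can load it: " + canLoad(isolated, name));
        System.out.println("java.util.ArrayList via isolated: "
                + canLoad(isolated, "java.util.ArrayList"));
    }
}
```

In this sketch the parent has the class, yet the isolated loader still throws
ClassNotFoundException, which matches the symptom of seeing the class in
-verbose:class output while the lookup fails.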

My Dockerfile is:

FROM flink:1.17.0-java8
ADD --chown=flink:flink \
    https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.3/flink-shaded-hadoop2-uber-2.8.3-1.8.3.jar \
    /opt/flink/lib/
ADD --chown=flink:flink \
    https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/3.3.5/hadoop-common-3.3.5.jar \
    /opt/flink/lib/
RUN mkdir /opt/flink/plugins/flink-oss-fs-hadoop/ && \
    cp /opt/flink/opt/flink-oss-fs-hadoop-1.17.0.jar \
       /opt/flink/plugins/flink-oss-fs-hadoop/

Does anyone have ideas why this problem occurs? Thanks!
