Thanks David for the quick response! *face palm* Thanks a lot, that seems to have addressed the NullPointerException issue. May I also suggest that page [1] be updated, since it says the key is "high-availability.cluster-id"?
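For anyone hitting the same exception, here is a minimal sketch of the HA block with the key David pointed to. The values are placeholders rather than my exact config: I am assuming the cluster ID can reuse the same templated name as before (without the leading slash), and "my-namespace" is hypothetical.

```yaml
# Sketch only: values are placeholders, not the exact config from this thread.
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://redacted-flink-dev/recovery
# Key named in David's reply; the NullPointerException was raised because it was not set.
kubernetes.cluster-id: {{ $fullName }}
# Recommended unless the "default" namespace is used; "my-namespace" is hypothetical.
kubernetes.namespace: my-namespace
```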
This led me to another issue, however:

"org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3'"

Section [2] describes how I can either use an environment variable, e.g. ENABLE_BUILT_IN_PLUGINS, or bake the plugin into the image by copying the provided plugin JAR from opt/ into a subdirectory of plugins/. I tried the latter:

Dockerfile (snippet)

    # Configure flink provided plugin for S3 access
    RUN mkdir -p $FLINK_HOME/plugins/s3-fs-presto
    RUN cp $FLINK_HOME/opt/flink-s3-fs-presto-*.jar $FLINK_HOME/plugins/s3-fs-presto/

When I open a shell in the image, the JAR is in place:

    flink@dd86717a92a0:~/plugins/s3-fs-presto$ ls
    flink-s3-fs-presto-1.12.5.jar

Any idea?

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#high-availability
[2] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins

On Wed, 25 Aug 2021 at 08:00, David Morávek <d...@apache.org> wrote:

> Hi Jonas,
>
> this exception is raised because "kubernetes.cluster-id" [1] is not set.
> I'd also recommend setting "kubernetes.namespace" option, unless you're
> using "default" namespace.
>
> I've filed FLINK-23961 [2] so we provide more descriptive warning for
> this issue next time ;)
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/#example-configuration
> [2] https://issues.apache.org/jira/browse/FLINK-23961
>
> Best,
> D.
>
> On Wed, Aug 25, 2021 at 8:48 AM jonas eyob <jonas.e...@gmail.com> wrote:
>
>> Hey, I've been struggling with this problem now for some days - driving
>> me crazy.
>>
>> I have a standalone kubernetes Flink (1.12.5) using an application
>> cluster mode approach.
>>
>> *The problem*
>> I am getting a NullPointerException when specifying the FQN of the
>> Kubernetes HA Service Factory class
>> i.e.
>> *org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory*
>>
>> What other configurations besides the ones specified (here
>> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions>)
>> may I be missing?
>>
>> Details:
>> * we are using a custom image using the flink: 1.12
>> <https://hub.docker.com/layers/flink/library/flink/1.12/images/sha256-4b4290888e30d27a28517bac3b1678674cd4b17aa7b8329969d1d12fcdf68f02?context=explore>
>> as base image
>>
>> flink-conf.yaml -- thought this may be useful?
>> flink-conf.yaml: |+
>>   jobmanager.rpc.address: {{ $fullName }}-jobmanager
>>   jobmanager.rpc.port: 6123
>>   jobmanager.memory.process.size: 1600m
>>   taskmanager.numberOfTaskSlots: 2
>>   taskmanager.rpc.port: 6122
>>   taskmanager.memory.process.size: 1728m
>>   blob.server.port: 6124
>>   queryable-state.proxy.ports: 6125
>>   parallelism.default: 2
>>   scheduler-mode: reactive
>>   execution.checkpointing.interval: 10s
>>   restart-strategy: fixed-delay
>>   restart-strategy.fixed-delay.attempts: 10
>>   high-availability:
>>   org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>>   high-availability.cluster-id: /{{ $fullName }}
>>   high-availability.storageDir: s3://redacted-flink-dev/recovery
>>
>> *Snippet of Job Manager pod log*
>> 2021-08-25 06:14:20,652 INFO
>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting
>> StandaloneApplicationClusterEntryPoint down with application status FAILED.
>> Diagnostics org.apache.flink.util.FlinkException: Could not create the ha
>> services from the instantiated HighAvailabilityServicesFactory
>> org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.
>>   at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:268)
>>   at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124)
>>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338)
>>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:296)
>>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
>>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:178)
>>   at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175)
>>   at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:585)
>>   at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:85)
>> Caused by: java.lang.NullPointerException
>>   at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59)
>>   at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.<init>(Fabric8FlinkKubeClient.java:85)
>>   at org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory.fromConfiguration(FlinkKubeClientFactory.java:106)
>>   at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:37)
>>   at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265)
>>   ... 9 more
>> .
>>
>> --
>> Many thanks,
>> Jonas
>>
>

--
*Kind regards*
*Jonas Eyob*