Hi,

I have a pod on OpenShift 4.6 running a Jupyter notebook with Spark 3.1.1 and Python 3.7 (based on Open Data Hub; I tweaked the Dockerfile because I wanted this specific Python version).
I'm trying to run Spark in client mode using the image from Google's Spark operator (gcr.io/spark-operator/spark-py:v3.1.1). Spark runs fine, but I'm unable to connect to a Kerberized Cloudera HDFS. I've tried the examples outlined in the security documentation (https://github.com/apache/spark/blob/master/docs/security.md#secure-interaction-with-kubernetes) and numerous other combinations, but nothing seems to work.

I managed to authenticate with Kerberos by passing an additional Java option to the driver and executors (-Djava.security.krb5.conf) and by passing the Kerberos config to the executors through the ConfigMap that is auto-generated from the folder which SPARK_CONF points to on the driver. I'll try to pass the Hadoop configuration files the same way and set the Hadoop home, just to test the connection.

I don't want to use that solution in prod. `spark.kubernetes.kerberos.krb5.configMapName` and `spark.kubernetes.hadoop.configMapName` don't seem to do anything: the pod spec of the executors doesn't have those volumes. I'm using `spark.kubernetes.authenticate.oauthToken` to authenticate with k8s, as a user who is a cluster admin. I also don't want to get a delegation token; I figured I could just use the keytab, even though the examples in the security documentation don't mention using a keytab with the ConfigMaps.

The configuration I'm trying to use:

- spark.kubernetes.authenticate.oauthToken, set to the OAuth token of a cluster admin
- spark.kubernetes.hadoop.configMapName, pointing to a ConfigMap containing the core-site.xml and hdfs-site.xml I got from Cloudera Manager
- spark.kubernetes.kerberos.krb5.configMapName, pointing to a ConfigMap containing a krb5.conf
- spark.kerberos.keytab
- spark.kerberos.principal

Thanks,
Gal
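
For reference, the configuration described in this message could be assembled in PySpark roughly as follows. This is only a sketch: the helper names `build_kerberos_conf` and `create_session`, the API server URL, the ConfigMap names, the keytab path, the principal and the token are hypothetical placeholders. Only the property names and the container image come from the message itself, and the sketch does not by itself resolve the missing executor volumes.

```python
from typing import Dict

# Container image named in the message.
SPARK_IMAGE = "gcr.io/spark-operator/spark-py:v3.1.1"


def build_kerberos_conf(
    api_server: str,
    oauth_token: str,
    hadoop_configmap: str,
    krb5_configmap: str,
    keytab_path: str,
    principal: str,
) -> Dict[str, str]:
    """Collect the Spark properties listed in the message into one dict.

    Every argument is a caller-supplied placeholder value.
    """
    return {
        "spark.master": f"k8s://{api_server}",
        "spark.submit.deployMode": "client",
        "spark.kubernetes.container.image": SPARK_IMAGE,
        "spark.kubernetes.authenticate.oauthToken": oauth_token,
        "spark.kubernetes.hadoop.configMapName": hadoop_configmap,
        "spark.kubernetes.kerberos.krb5.configMapName": krb5_configmap,
        "spark.kerberos.keytab": keytab_path,
        "spark.kerberos.principal": principal,
    }


def create_session(conf: Dict[str, str]):
    """Apply the properties to a SparkSession builder (requires pyspark)."""
    # Imported lazily so build_kerberos_conf can be used without pyspark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("kerberized-hdfs-test")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Hypothetical placeholder values, not taken from a real cluster.
example_conf = build_kerberos_conf(
    api_server="https://api.example.invalid:6443",
    oauth_token="<token>",
    hadoop_configmap="hadoop-conf",
    krb5_configmap="krb5-conf",
    keytab_path="/path/to/user.keytab",
    principal="user@EXAMPLE.INVALID",
)
```

Calling `create_session(example_conf)` from the notebook would then start the client-mode session with these properties; whether the executors actually mount the two ConfigMaps is exactly what remains to be checked in their pod spec.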