[ https://issues.apache.org/jira/browse/SPARK-42837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-42837:
---------------------------------
    Component/s: Kubernetes

> spark-submit - issue when resolving dependencies hosted on a private repository in kubernetes cluster mode
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42837
>                 URL: https://issues.apache.org/jira/browse/SPARK-42837
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Submit
>    Affects Versions: 3.3.2
>            Reporter: lione Herbet
>            Priority: Minor
>
> When using the [spark operator|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator],
> if dependencies are hosted on a private repository that requires
> authentication (such as S3 or OCI), the spark operator submitting the job
> needs to have all the secrets required to access every dependency.
> Otherwise spark-submit fails.
> On a multi-tenant Kubernetes cluster where the spark operator and the Spark
> jobs run in separate namespaces, this means duplicating all the secrets,
> or the submission won't work.
> It seems that spark-submit needs to access the dependencies (with
> credentials) only to run resolveGlobPath
> ([https://github.com/apache/spark/blob/v3.3.2/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L364-L367]).
> It seems to me (though this needs to be confirmed by someone more familiar
> with Spark internals) that this resolveGlobPath step is also performed when
> the driver downloads the jars.
> Would it be possible to skip this resolveGlobPath step when running on a
> Kubernetes cluster in cluster mode?
> For example, add a condition like this around lines 364-367:
> {code:java}
> if (isKubernetesCluster) {
> ...
> } {code}
> We could even, for compatibility with the old behavior if needed, also add
> a condition on a Spark parameter, like this:
> {code:java}
> if (isKubernetesCluster &&
> sparkConf.getBoolean("spark.kubernetes.resolevGlobPathsInSubmit", true)) {
> ...
> } {code}
> I tested both solutions locally and they seem to resolve the issue.
> Do you think I need to consider anything else?
> I may submit a patch depending on your feedback.
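
The decision described in the report can be sketched as a small standalone helper. This is only one reading of the proposal, not Spark's actual code and not a patch to SparkSubmit.scala. The class name GlobResolutionGate and the method shouldResolveGlobs are hypothetical, and the config key is the one spelled in the report, not an existing Spark setting. The sketch assumes that every mode other than Kubernetes cluster mode keeps resolving globs at submit time, and that the flag defaults to true so the current behavior is preserved unless a user opts out.

```java
import java.util.Map;

/** Hypothetical sketch of the proposed gate; not part of Spark. */
final class GlobResolutionGate {
    // Key copied as spelled in the report; a proposed name, not an existing Spark config.
    static final String RESOLVE_KEY = "spark.kubernetes.resolevGlobPathsInSubmit";

    private GlobResolutionGate() {}

    /**
     * Returns true when spark-submit should expand glob paths itself,
     * false when that work is left to the driver.
     *
     * @param isKubernetesCluster true for Kubernetes master + cluster deploy mode
     * @param conf                plain string map standing in for SparkConf
     */
    static boolean shouldResolveGlobs(boolean isKubernetesCluster, Map<String, String> conf) {
        if (!isKubernetesCluster) {
            // Assumption: all other modes keep the existing submit-time resolution.
            return true;
        }
        // Default "true" keeps the old behavior; only an explicit "false" skips resolution.
        // Note: parseBoolean treats any value other than "true" as false.
        return Boolean.parseBoolean(conf.getOrDefault(RESOLVE_KEY, "true"));
    }
}
```

With this shape, an operator running in its own namespace would set the flag to false and would no longer need the repository credentials at submit time, provided the driver really does resolve the globs itself, which is the point the report asks to have confirmed.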