[ https://issues.apache.org/jira/browse/SPARK-28895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-28895:
----------------------------------
        Parent:     (was: SPARK-33005)
    Issue Type: Bug  (was: Sub-task)

> Spark client process is unable to upload jars to HDFS when using a ConfigMap instead of HADOOP_CONF_DIR
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28895
>                 URL: https://issues.apache.org/jira/browse/SPARK-28895
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Kent Yao
>            Priority: Major
>
> The *BasicDriverFeatureStep* for Spark on Kubernetes uploads the files/jars 
> specified by --files/--jars to a Hadoop-compatible file system configured via 
> spark.kubernetes.file.upload.path. When HADOOP_CONF_DIR is set, the 
> spark-submit process can resolve that file system. But when using 
> spark.kubernetes.hadoop.configMapName, the Hadoop configuration is only 
> mounted on the Pods and is never applied back to the client process, so the 
> upload fails.
>  
> ||Hadoop configuration on the client||Upload result||
> |HADOOP_CONF_DIR=/path/to/etc/hadoop|OK|
> |spark.kubernetes.hadoop.configMapName=hz10-hadoop-dir|FAILED|
>  
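> A minimal client-side sketch of the failure mode (an illustrative repro with the plain Hadoop client API, not Spark's actual upload code): FileSystem.get resolves hdfs://hz-cluster10 against the client's local Hadoop configuration, and without HADOOP_CONF_DIR the HA nameservice name is treated as an ordinary hostname, so DNS lookup fails.
> {code:scala}
> import java.net.URI
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
>
> object UploadRepro {
>   def main(args: Array[String]): Unit = {
>     // Empty configuration: no core-site.xml/hdfs-site.xml, just like the
>     // spark-submit client when only spark.kubernetes.hadoop.configMapName is set.
>     val conf = new Configuration()
>     // "hz-cluster10" is an HA nameservice, not a host; with no dfs.nameservices
>     // entry this throws IllegalArgumentException:
>     // java.net.UnknownHostException: hz-cluster10, matching the trace below.
>     val fs = FileSystem.get(new URI("hdfs://hz-cluster10/user/kyuubi/udf"), conf)
>     fs.copyFromLocalFile(new Path("/tmp/example.jar"),
>       new Path("hdfs://hz-cluster10/user/kyuubi/udf/example.jar"))
>   }
> }
> {code}
> The actual spark-submit failure: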
> {code:java}
>  Kent@KentsMacBookPro  ~/Documents/spark-on-k8s/spark-3.0.0-SNAPSHOT-bin-2.7.3
> bin/spark-submit \
>   --conf spark.kubernetes.file.upload.path=hdfs://hz-cluster10/user/kyuubi/udf \
>   --jars /Users/Kent/Documents/spark-on-k8s/spark-3.0.0-SNAPSHOT-bin-2.7.3/hadoop-lzo-0.4.20-SNAPSHOT.jar \
>   --conf spark.kerberos.keytab=/Users/Kent/Downloads/kyuubi.keytab \
>   --conf spark.kerberos.principal=kyuubi/d...@hadoop.hz.netease.com \
>   --conf spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf \
>   --name hehe \
>   --deploy-mode cluster \
>   --class org.apache.spark.examples.HdfsTest \
>   local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0-SNAPSHOT.jar \
>   hdfs://hz-cluster10/user/kyuubi/hive_db/kyuubi.db/hive_tbl
> Listening for transport dt_socket at address: 50014
> # spark.master=k8s://https://10.120.238.100:7443
> 19/08/27 17:21:06 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> 19/08/27 17:21:07 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
> Listening for transport dt_socket at address: 50014
> Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/Kent/Documents/spark-on-k8s/spark-3.0.0-SNAPSHOT-bin-2.7.3/hadoop-lzo-0.4.20-SNAPSHOT.jar failed...
>       at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:287)
>       at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:246)
>       at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>       at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>       at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>       at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>       at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>       at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>       at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:245)
>       at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:165)
>       at scala.collection.immutable.List.foreach(List.scala:392)
>       at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:163)
>       at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$3(KubernetesDriverBuilder.scala:60)
>       at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
>       at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
>       at scala.collection.immutable.List.foldLeft(List.scala:89)
>       at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:58)
>       at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:101)
>       at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$10(KubernetesClientApplication.scala:236)
>       at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$10$adapted(KubernetesClientApplication.scala:229)
>       at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2567)
>       at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:229)
>       at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:198)
>       at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:920)
>       at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:179)
>       at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:202)
>       at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
>       at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:999)
>       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1008)
>       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: hz-cluster10
>       at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
>       at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
>       at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
>       at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
>       at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
>       at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
>       at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
>       at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
>       at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
>       at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
>       at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1881)
>       at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:278)
>       ... 30 more
> Caused by: java.net.UnknownHostException: hz-cluster10
>       ... 43 more
> {code}
> Other related Spark configurations:
> {code:java}
> spark.master=k8s://https://10.120.238.100:7443
> # spark.master=k8s://https://10.120.238.253:7443
> spark.kubernetes.container.image=harbor-inner.sparkonk8s.netease.com/tenant1-project1/spark:v3.0.0-20190813
> # spark.kubernetes.driver.container.image=harbor-inner.sparkonk8s.netease.com/tenant1-project1/spark:v3.0.0-20190813
> # spark.kubernetes.executor.container.image=harbor-inner.sparkonk8s.netease.com/tenant1-project1/spark:v3.0.0-20190813
> spark.executor.instances=5
> spark.kubernetes.namespace=ns1
> spark.kubernetes.container.image.pullSecrets=mysecret
> spark.kubernetes.hadoop.configMapName=hz10-hadoop-dir
> spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf
> spark.kerberos.principal=kyuubi/d...@hadoop.hz.netease.com
> spark.kerberos.keytab=/Users/Kent/Downloads/kyuubi.keytab
> {code}
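> For reference, a short sketch of what the working HADOOP_CONF_DIR row in the table above supplies (the file paths here are hypothetical placeholders): once core-site.xml and hdfs-site.xml are visible to the client, dfs.nameservices maps hz-cluster10 to its NameNodes and the upload target resolves.
> {code:scala}
> import java.net.URI
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
>
> object UploadWithConf {
>   def main(args: Array[String]): Unit = {
>     val conf = new Configuration()
>     // Hypothetical local copies of the cluster config, i.e. the same XML files
>     // that HADOOP_CONF_DIR (or the ConfigMap, on the Pods) provides.
>     conf.addResource(new Path("/path/to/etc/hadoop/core-site.xml"))
>     conf.addResource(new Path("/path/to/etc/hadoop/hdfs-site.xml")) // defines the hz-cluster10 nameservice
>     // With dfs.nameservices present, the same URI now resolves to the HA NameNodes.
>     val fs = FileSystem.get(new URI("hdfs://hz-cluster10/user/kyuubi/udf"), conf)
>     println(fs.getUri)
>   }
> }
> {code}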


