[ https://issues.apache.org/jira/browse/SPARK-46860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vikram Janarthanan updated SPARK-46860:
---------------------------------------
    Affects Version/s: 3.5.0
                       3.3.3

> Credentials with https url not working for --jars, --files, --archives &
> --py-files options on spark-submit command
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-46860
>                 URL: https://issues.apache.org/jira/browse/SPARK-46860
>             Project: Spark
>          Issue Type: Task
>          Components: k8s
>    Affects Versions: 3.3.3, 3.5.0, 3.3.4
>         Environment: Spark 3.3.3 deployed on K8s
>            Reporter: Vikram Janarthanan
>            Priority: Major
>
> We are trying to run a Spark application whose dependency files and main
> PySpark script are both served from a secure web server, and we are looking
> for a way to pass them to spark-submit directly from that web server.
> Deploying the application to the k8s cluster from the web server without a
> username and password worked, but with a username/password we get:
> "Exception in thread "main" java.io.IOException: Server returned HTTP
> response code: 401 for URL:
> https://username:passw...@domain.com/application/pysparkjob.py"
>
> *Working options on spark-submit:*
> spark-submit ...... \
> --repositories https://username:passw...@domain.com/repo1/repo \
> --jars https://domain.com/jars/runtime.jar \
> --files https://domain.com/files/query.sql \
> --py-files https://domain.com/pythonlib/pythonlib.zip \
> https://domain.com/app1/pysparkapp.py
> Note: only the --repositories option works with a username and password.
>
> *spark-submit using https URLs with username/password (not working):*
> spark-submit ...... \
> --jars https://username:passw...@domain.com/jars/runtime.jar \
> --files https://username:passw...@domain.com/files/query.sql \
> --py-files https://username:passw...@domain.com/pythonlib/pythonlib.zip \
> https://username:passw...@domain.com/app1/pysparkapp.py
>
> Error:
> 25/01/23 09:19:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Exception in thread "main" java.io.IOException: Server returned HTTP response code: 401 for URL: https://username:passw...@domain.com/repository/spark-artifacts/pysparkdemo/1.0/pysparkdemo-1.0.tgz
>     at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:2000)
>     at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
>     at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:224)
>     at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:809)
>     at org.apache.spark.util.DependencyUtils$.downloadFile(DependencyUtils.scala:264)
>     at org.apache.spark.util.DependencyUtils$.$anonfun$downloadFileList$2(DependencyUtils.scala:233)
>     at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
>     at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>     at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>     at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
>     at scala.collection.TraversableLike.map(TraversableLike.scala:286)
>     at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
>     at scala.collection.AbstractTraversable.map(Traversable.scala:108)
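
The report shows the download going through Java's HttpURLConnection and being rejected with a 401, which suggests (this is an assumption, not something confirmed in the ticket) that the user-info part of the URL is not being turned into an Authorization header. One possible workaround, sketched below, is to fetch the artifacts before calling spark-submit and send HTTP Basic credentials explicitly. The helper names `split_credentials` and `fetch_to_local` are hypothetical, and the sketch assumes the web server accepts Basic authentication.

```python
import base64
import os
import urllib.request
from urllib.parse import unquote, urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_userinfo, basic_auth_header_or_None).

    Hypothetical helper: strips "user:password@" from the URL and encodes it
    as an HTTP Basic "Authorization" header value.
    """
    parts = urlsplit(url)
    if parts.username is None:
        return url, None

    # Rebuild the network location without the user-info component.
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal: urlsplit strips the brackets
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )

    # Credentials may be percent-encoded inside a URL.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return clean_url, f"Basic {token}"


def fetch_to_local(url, dest_dir):
    """Download url into dest_dir, sending Basic auth if the URL has user-info."""
    clean_url, auth = split_credentials(url)
    request = urllib.request.Request(clean_url)
    if auth is not None:
        request.add_header("Authorization", auth)

    os.makedirs(dest_dir, exist_ok=True)
    filename = os.path.basename(urlsplit(clean_url).path) or "download"
    target = os.path.join(dest_dir, filename)
    with urllib.request.urlopen(request) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

The local paths returned by `fetch_to_local` could then be passed to --jars, --files and --py-files in place of the credentialed URLs. Whether local paths are usable depends on the deploy mode and on how the cluster makes submitted files available to the driver and executors, so this is a stopgap rather than a fix for the behaviour reported here.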