[ 
https://issues.apache.org/jira/browse/SPARK-22585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264234#comment-16264234
 ] 

Jakub Dubovsky commented on SPARK-22585:
----------------------------------------

I am not sure about adding an encoding step to the implementation of the 
addJar method. The whole path cannot simply be encoded as one string, because 
some characters (':', '/' and possibly others) have to stay literal. The code 
would first need to parse the path into segments and encode only those. That 
most probably means using URI again, at which point the problem becomes 
circular. Moreover, I am not sure what the point is of encoding path segments 
only to ask URI to decode them again...
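
To make the circularity concrete, here is a minimal sketch, assuming the 
encoding step were done with the multi-argument java.net.URI constructor 
(which quotes '%' but leaves '/' alone). The object and value names are 
hypothetical and none of this is taken from the Spark code base:

```scala
import java.net.URI

object EncodeThenDecode {
  // Path from the issue report; "%3A" is literally part of a directory name.
  val localPath: String =
    "/home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar"

  // Assumed encoding step: the four-argument constructor
  // (scheme, host, path, fragment) quotes '%' as "%25" and keeps '/' as is.
  val encoded: URI = new URI(null, null, localPath, null)

  def main(args: Array[String]): Unit = {
    // Raw form now carries "%253A" instead of "%3A".
    println(encoded.getRawPath)
    // getPath decodes it straight back to the string we started with.
    println(encoded.getPath)
  }
}
```

The encode step exists only so that getPath can undo it, which is the round 
trip questioned above.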

I also think it makes sense to decode a segment only inside the logic that 
accesses the value of that segment. If I work with a URL/path as a whole, I 
want to keep it parsable and therefore keep special characters encoded. This 
is the reasoning I would personally use to decide which version 
(getPath/getRawPath) should be used in particular scenarios across the Spark 
code base, even though I must admit I have very little insight into these 
other URI use cases :)
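
A rough sketch of that rule of thumb, using only java.net.URI. The object 
name and the decodeSegment helper are hypothetical, and the segment index is 
specific to the example path from the issue:

```scala
import java.net.URI

object SegmentDecoding {
  val uri: URI =
    new URI("/home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar")

  // Working with the path as a whole: keep the raw, still-encoded form.
  val wholePath: String = uri.getRawPath

  // getPath decodes every segment at once, so "%3A" turns into ':'.
  val decodedPath: String = uri.getPath

  // Hypothetical helper: decode a single raw segment only when its value is
  // read. The leading "/" keeps a ':' in the segment from being parsed as a
  // scheme separator.
  def decodeSegment(rawSegment: String): String =
    new URI("/" + rawSegment).getPath.substring(1)

  // Split on the raw form so an encoded '/' could never create extra segments.
  val rawSegments: Array[String] = wholePath.split('/').filter(_.nonEmpty)

  // Segment 6 of this example path is "artifactory.com%3A443".
  val decodedSegment: String = decodeSegment(rawSegments(6))
}
```

Here wholePath still equals the input string, while decodedPath contains 
"artifactory.com:443", which is the form that shows up in the 
FileNotFoundException quoted below.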

> Url encoding of jar path expected?
> ----------------------------------
>
>                 Key: SPARK-22585
>                 URL: https://issues.apache.org/jira/browse/SPARK-22585
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Jakub Dubovsky
>
> I am calling {code}sparkContext.addJar{code} method with path to a local jar 
> I want to add. Example:
> {code}/home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar{code}.
>  As a result I get an exception saying
> {code}
> Failed to add 
> /home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar to Spark 
> environment. Stacktrace:
> java.io.FileNotFoundException: Jar 
> /home/me/.coursier/cache/v1/https/artifactory.com:443/path/to.jar not found
> {code}
> Important part to notice here is that colon character is url encoded in path 
> I want to use but exception is complaining about path in decoded form. This 
> is caused by this line of code from implementation ([see 
> here|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/SparkContext.scala#L1833]):
> {code}
> case null | "file" => addJarFile(new File(uri.getPath))
> {code}
> It uses 
> [getPath|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#getPath()]
>  method of 
> [java.net.URI|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html] 
> which url decodes the path. I believe method 
> [getRawPath|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#getRawPath()]
>  should be used here which keeps path string in original form.
> I tend to see this as a bug since I want to use my dependencies resolved from 
> artifactory with port directly. Is there some specific reason for this or can 
> we fix this?
> Thanks


