[ https://issues.apache.org/jira/browse/SPARK-22585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264234#comment-16264234 ]
Jakub Dubovsky commented on SPARK-22585:
----------------------------------------

I am not sure about adding an encoding step to the implementation of the addJar method. It is not a matter of encoding the whole path as a string, since some characters (':', '/', possibly others) must be kept literally. So the code would first need to parse the path into its segments and encode only those. That most probably leads to using URI again, at which point this becomes a circular problem. Moreover, I am not sure what the point is of encoding path segments only to ask URI to decode them...

I also think it makes sense to decode a segment only inside logic that accesses the value of that segment. If I work with a URL/path as a whole, I want to keep it parsable and therefore keep special characters encoded. This is the reasoning I would personally use to decide which version (getPath/getRawPath) should be used in particular scenarios across the Spark code base, even though I must admit I have very little insight into these other URI use cases :)

> Url encoding of jar path expected?
> ----------------------------------
>
>                 Key: SPARK-22585
>                 URL: https://issues.apache.org/jira/browse/SPARK-22585
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Jakub Dubovsky
>
> I am calling {code}sparkContext.addJar{code} method with path to a local jar
> I want to add. Example:
> {code}/home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar{code}.
> As a result I get an exception saying
> {code}
> Failed to add
> /home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar to Spark
> environment. Stacktrace:
> java.io.FileNotFoundException: Jar
> /home/me/.coursier/cache/v1/https/artifactory.com:443/path/to.jar not found
> {code}
> Important part to notice here is that colon character is url encoded in path
> I want to use but exception is complaining about path in decoded form.
> This is caused by this line of code from implementation ([see
> here|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/SparkContext.scala#L1833]):
> {code}
> case null | "file" => addJarFile(new File(uri.getPath))
> {code}
> It uses
> [getPath|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#getPath()]
> method of
> [java.net.URI|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html]
> which url decodes the path. I believe method
> [getRawPath|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#getRawPath()]
> should be used here which keeps path string in original form.
> I tend to see this as a bug since I want to use my dependencies resolved from
> artifactory with port directly. Is there some specific reason for this or can
> we fix this?
> Thanks
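
For illustration, the getPath/getRawPath difference discussed in this thread can be reproduced with java.net.URI alone. This is only a minimal sketch: the class name JarPathDemo and its two helper methods are hypothetical and not part of Spark, and the path is the example from the issue description.

```java
import java.net.URI;

// Hypothetical demo class (not Spark code): compares the two URI accessors.
class JarPathDemo {

    // getPath() returns the path with percent-escapes decoded.
    static String decodedPath(String path) {
        return URI.create(path).getPath();
    }

    // getRawPath() returns the path exactly as it was given.
    static String rawPath(String path) {
        return URI.create(path).getRawPath();
    }

    public static void main(String[] args) {
        String jar = "/home/me/.coursier/cache/v1/https/artifactory.com%3A443/path/to.jar";

        // Decoded form: "%3A" becomes ":", so this no longer names the file on disk.
        System.out.println(decodedPath(jar));

        // Raw form: "%3A" is preserved, matching the directory name on disk.
        System.out.println(rawPath(jar));
    }
}
```

Under these assumptions, only the raw form still matches the on-disk directory name artifactory.com%3A443, which is why the decoded form ends in a FileNotFoundException when it is passed to new File(...).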