Manos Tsagkias created SPARK-32134:
--------------------------------------

             Summary: YARN: archives rename with # doesn't work for https
                 Key: SPARK-32134
                 URL: https://issues.apache.org/jira/browse/SPARK-32134
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 2.3.0
            Reporter: Manos Tsagkias

This is related to SPARK-10858.

The YARN distributed cache feature with --archives, where you can rename the archive using a # symbol, does not work with the http(s) scheme:

{{--archives http://mirror.sfo12.us.leaseweb.net/centos/6.10/isos/i386/sha1sum.txt#sha1sum}}

This is because URLs can have fragments, so the # and the name after it are interpreted as the URL's fragment rather than as the rename target.

We could use the same trick we use for the other two schemes, file:// and hdfs://: first remove the last fragment, then parse the URL, and then reattach the fragment. The [code exists|https://github.com/apache/spark/pull/9035/files] but it is not applied to URLs.
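
The "remove the fragment, parse, reattach" idea can be sketched outside Spark. The snippet below is a minimal Python illustration only, not Spark's implementation (Spark's YARN client is written in Scala). The helper names split_archive_spec and reattach are hypothetical, and falling back to the last path segment when no name is given is an assumption made for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def split_archive_spec(spec):
    """Split 'URL#name' at the last '#' into (url, link_name).

    Hypothetical helper: when no name is given, fall back to the last
    path segment of the URL (an assumption made for this sketch).
    """
    base, sep, name = spec.rpartition("#")
    if not sep:
        # No '#' present: rpartition leaves the whole string in `name`.
        base, name = spec, ""
    parts = urlsplit(base)  # parse only the part before the last '#'
    if not name:
        name = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return base, name


def reattach(url, name):
    """Put the link name back as the fragment of the parsed URL."""
    return urlunsplit(urlsplit(url)._replace(fragment=name))


spec = "http://mirror.sfo12.us.leaseweb.net/centos/6.10/isos/i386/sha1sum.txt#sha1sum"
url, link_name = split_archive_spec(spec)
print(url)                       # URL to fetch, without the rename suffix
print(link_name)                 # name the archive should get in the cache
print(reattach(url, link_name))  # the original spec, rebuilt after parsing
```

Splitting on the last # before parsing keeps the rename suffix out of the URL parser, which is the same order of operations the description proposes for http(s).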