[ https://issues.apache.org/jira/browse/SPARK-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158725#comment-14158725 ]
Apache Spark commented on SPARK-3788:
-------------------------------------

User 'vanzin' has created a pull request for this issue:
https://github.com/apache/spark/pull/2650

> Yarn dist cache code is not friendly to HDFS HA, Federation
> -----------------------------------------------------------
>
>                 Key: SPARK-3788
>                 URL: https://issues.apache.org/jira/browse/SPARK-3788
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>
> There are two bugs here.
> 1. The {{compareFs()}} method in ClientBase considers the 'host' part of the
> URI to be an actual host. In the case of HA and Federation, that's a
> namespace name, which doesn't resolve to anything. So in those cases,
> {{compareFs()}} always says the file systems are different.
> 2. In {{prepareLocalResources()}}, when adding a file to the distributed
> cache, that is done with the common FileSystem object instantiated at the
> start of the method. In the case of Federation that doesn't work: the
> qualified URL's scheme may differ from the non-qualified one, so the
> FileSystem instance will not work.
> Fixes are pretty trivial.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
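
To illustrate the first bug described above, here is a minimal Java sketch of a file system comparison that does not depend on the URI 'host' resolving. It is not the code from the pull request. The class FsCompareSketch, the method sameFileSystem(), and the canonicalize hook are hypothetical names introduced for this example; the hook stands in for whatever DNS canonicalization a real implementation might perform. The idea is to compare the raw host strings first and only fall back to canonicalization when they differ, so two URIs that name the same HA or Federation namespace match without any lookup.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Objects;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration only; not part of Spark or Hadoop.
final class FsCompareSketch {
    private FsCompareSketch() {}

    /**
     * Returns true when src and dst appear to name the same file system.
     * The canonicalize hook (an assumption of this sketch) is consulted only
     * when the raw host strings differ, so a logical namespace name that does
     * not resolve is never looked up when both URIs already agree on it.
     */
    public static boolean sameFileSystem(URI src, URI dst, UnaryOperator<String> canonicalize) {
        String srcScheme = src.getScheme();
        // A missing or mismatched scheme is treated as "different file systems".
        if (srcScheme == null || !srcScheme.equalsIgnoreCase(dst.getScheme())) {
            return false;
        }
        String srcHost = src.getHost();
        String dstHost = dst.getHost();
        // Only attempt canonicalization when the names are not already equal.
        if (srcHost != null && dstHost != null && !srcHost.equalsIgnoreCase(dstHost)) {
            srcHost = canonicalize.apply(srcHost);
            dstHost = canonicalize.apply(dstHost);
        }
        if (!Objects.equals(lower(srcHost), lower(dstHost))) {
            return false;
        }
        // getPort() returns -1 when no port is given, e.g. for a nameservice URI.
        return src.getPort() == dst.getPort();
    }

    private static String lower(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Identity hook: this demo performs no DNS lookups at all.
        UnaryOperator<String> noLookup = host -> host;
        URI a = URI.create("hdfs://nameservice1/user/spark/app.jar");
        URI b = URI.create("hdfs://nameservice1/tmp/staging");
        URI c = URI.create("hdfs://nameservice2/tmp/staging");
        System.out.println(sameFileSystem(a, b, noLookup));
        System.out.println(sameFileSystem(a, c, noLookup));
    }
}
```

The namespace names nameservice1 and nameservice2 are made-up examples. Note that java.net.URI.getHost() returns null for authorities that are not valid host names (for instance, ones containing an underscore), in which case this sketch compares the two null hosts as equal and falls through to the port check; a real implementation would likely want to compare the full authority instead.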