[ https://issues.apache.org/jira/browse/SPARK-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247128#comment-15247128 ]
Liwei Lin commented on SPARK-14687:
-----------------------------------

Updated with problem details. Thanks for the reminder! :-)

> Call path.getFileSystem(conf) instead of call FileSystem.get(conf)
> ------------------------------------------------------------------
>
>                 Key: SPARK-14687
>                 URL: https://issues.apache.org/jira/browse/SPARK-14687
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib, Spark Core, SQL
>    Affects Versions: 2.0.0
>            Reporter: Liwei Lin
>            Priority: Minor
>
> Generally we should call path.getFileSystem(conf) instead of calling
> FileSystem.get(conf), because the latter always returns the file system of the
> DEFAULT_URI (fs.defaultFS), regardless of the path, which leads to problems
> in certain situations:
> - if {{fs.defaultFS}} is {{hdfs://clusterA/...}}, but the path is
> {{hdfs://clusterB/...}}: then we'll encounter
> {{java.lang.IllegalArgumentException (Wrong FS: hdfs://clusterB/...,
> expected: hdfs://clusterA/...)}}
> - if {{fs.defaultFS}} is not specified, the scheme defaults to
> {{file:///}}: then we'll encounter {{java.lang.IllegalArgumentException
> (Wrong FS: hdfs://..., expected: file:///)}}
> - if {{fs.defaultFS}} is not {{hdfs://...}}, for example {{viewfs://}} (which
> is used for federated HDFS): then we'll encounter
> {{java.lang.IllegalArgumentException (Wrong FS: hdfs://..., expected:
> viewfs:///)}}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
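
To illustrate the failure modes listed in the description, here is a minimal, self-contained sketch. It does not use Hadoop at all: `WrongFsSketch`, `ToyFileSystem`, `fromDefault`, and `fromPath` are hypothetical names invented for this example, and the scheme/authority comparison is a deliberately simplified model of the check behind the "Wrong FS" message, not Hadoop's actual implementation. `fromDefault` stands in for `FileSystem.get(conf)` and `fromPath` stands in for `path.getFileSystem(conf)`.

```java
import java.net.URI;

final class WrongFsSketch {

    /** Toy stand-in for a file system handle; it only remembers the URI it serves. */
    static final class ToyFileSystem {
        final URI uri;

        ToyFileSystem(URI uri) {
            this.uri = uri;
        }

        /** Simplified check: a qualified path must share this file system's scheme and authority. */
        void checkPath(URI path) {
            if (path.getScheme() == null) {
                return; // unqualified path: treated as belonging to this file system
            }
            boolean sameScheme = path.getScheme().equalsIgnoreCase(uri.getScheme());
            boolean sameAuthority = equalsIgnoreCase(path.getAuthority(), uri.getAuthority());
            if (!sameScheme || !sameAuthority) {
                throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + uri);
            }
        }
    }

    /** Models FileSystem.get(conf): the path is ignored, only fs.defaultFS matters. */
    static ToyFileSystem fromDefault(URI defaultFs) {
        return new ToyFileSystem(defaultFs);
    }

    /** Models path.getFileSystem(conf): a qualified path selects its own file system. */
    static ToyFileSystem fromPath(URI path, URI defaultFs) {
        if (path.getScheme() == null) {
            return new ToyFileSystem(defaultFs); // nothing to go on, fall back to the default
        }
        String authority = path.getAuthority() == null ? "" : path.getAuthority();
        return new ToyFileSystem(URI.create(path.getScheme() + "://" + authority + "/"));
    }

    private static boolean equalsIgnoreCase(String a, String b) {
        return a == null ? b == null : a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://clusterA/");
        URI path = URI.create("hdfs://clusterB/data/part-0");

        // File system derived from the path itself: the check passes.
        fromPath(path, defaultFs).checkPath(path);
        System.out.println("path-derived file system accepted " + path);

        // File system derived from the default only: the check rejects the path.
        try {
            fromDefault(defaultFs).checkPath(path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In real Spark or Hadoop code the corresponding change is the one named in the issue title: obtain the file system from the path with `path.getFileSystem(conf)` rather than from the configuration alone with `FileSystem.get(conf)`.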