Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/19885
  
    @vanzin it's too late for this, but I don't see any reason why 
`FileSystem.getCanonicalUri` should be kept protected. If someone wants to 
volunteer the spec changes to filesystem.md & the contract tests, they'll get 
support.
    
    Looking at what HDFS does there: it calls out HA support as special, 
because you can't DNS-resolve a logical URI:
    ```java
      protected URI canonicalizeUri(URI uri) {
        if (HAUtilClient.isLogicalUri(getConf(), uri)) {
          // Don't try to DNS-resolve logical URIs, since the 'authority'
          // portion isn't a proper hostname
          return uri;
        } else {
          return NetUtils.getCanonicalUri(uri, getDefaultPort());
        }
      }
    ```
    
    where `NetUtils.getCanonicalUri()` does the DNS lookup, caching 
previously canonicalized hosts via `SecurityUtil.getByName`. `SecurityUtil` is 
tagged `@Public`; `NetUtils` isn't, but that could be relaxed while nobody is 
looking. It still doesn't address the big issue: different filesystems clearly 
have different rules about what "canonical" means, and you don't want to try to 
work them out and replicate them, as they're a moving maintenance target.
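
    To make concrete what kind of canonicalization is at stake, here is a 
standalone sketch (my own illustration, not Hadoop's actual code): fill in the 
scheme's default port when the URI omits one, so otherwise-equal URIs compare 
equal. Hadoop's `NetUtils.getCanonicalUri` additionally DNS-resolves the host, 
which this sketch deliberately skips:
    ```java
    import java.net.URI;
    import java.net.URISyntaxException;

    public class CanonicalizeSketch {
        // Illustrative only: add the default port when none is given.
        static URI canonicalize(URI uri, int defaultPort) throws URISyntaxException {
            if (uri.getPort() != -1 || uri.getHost() == null) {
                return uri;  // already has a port, or no authority to qualify
            }
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                    defaultPort, uri.getPath(), uri.getQuery(), uri.getFragment());
        }

        public static void main(String[] args) throws URISyntaxException {
            System.out.println(canonicalize(new URI("hdfs://namenode/data"), 8020));
            // hdfs://namenode:8020/data
        }
    }
    ```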
    
    I'm stuck at this point. Created 
[HADOOP-15094](https://issues.apache.org/jira/browse/HADOOP-15094). 
    Looking at `FileSystem.CACHE`: it compares on (scheme, authority, 
ugi), so it will actually return different FS instances for unqualified and 
qualified hostnames. Maybe for this specific problem it's simplest to say "if 
you do that, don't expect things to work".
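
    A sketch of why that cache behaves this way (a simplified stand-in for 
the key comparison described above, with ugi omitted; not the real 
`FileSystem.Cache.Key` code): the scheme and authority are compared verbatim, 
with no canonicalization, so a short hostname and its FQDN produce distinct 
keys and therefore distinct cached instances:
    ```java
    import java.net.URI;
    import java.util.Objects;

    public class CacheKeySketch {
        // Simplified cache key: scheme + authority, compared verbatim.
        record Key(String scheme, String authority) {
            static Key of(URI uri) {
                return new Key(Objects.toString(uri.getScheme(), ""),
                               Objects.toString(uri.getAuthority(), ""));
            }
        }

        public static void main(String[] args) {
            Key shortHost = Key.of(URI.create("hdfs://nn/data"));
            Key fqdn      = Key.of(URI.create("hdfs://nn.example.com/data"));
            // Distinct keys -> the cache would hand back two FS instances.
            System.out.println(shortHost.equals(fqdn));  // false
        }
    }
    ```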

