Hi everyone,

On Spark 2.2.0, if you wanted to create a custom file system implementation, you just extended org.apache.hadoop.fs.FileSystem and put the canonical name of the custom class in the file src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem.
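For reference, the custom filesystem boils down to something like the sketch below (the package, the class name and the "customfs" scheme are placeholders for my real implementation):

  // Minimal sketch only; com.example.fs.CustomFileSystem and "customfs" are placeholder names.
  package com.example.fs

  import java.net.URI
  import org.apache.hadoop.fs.{FSDataInputStream, FSDataOutputStream, FileStatus, FileSystem, Path}
  import org.apache.hadoop.fs.permission.FsPermission
  import org.apache.hadoop.util.Progressable

  class CustomFileSystem extends FileSystem {
    // Scheme under which the ServiceLoader mechanism registers this FileSystem.
    override def getScheme(): String = "customfs"
    override def getUri(): URI = URI.create("customfs:///")

    // Abstract methods required by FileSystem; the real logic is omitted here.
    override def open(f: Path, bufferSize: Int): FSDataInputStream = ???
    override def create(f: Path, permission: FsPermission, overwrite: Boolean, bufferSize: Int,
        replication: Short, blockSize: Long, progress: Progressable): FSDataOutputStream = ???
    override def append(f: Path, bufferSize: Int, progress: Progressable): FSDataOutputStream = ???
    override def rename(src: Path, dst: Path): Boolean = ???
    override def delete(f: Path, recursive: Boolean): Boolean = ???
    override def listStatus(f: Path): Array[FileStatus] = ???
    override def setWorkingDirectory(newDir: Path): Unit = ()
    override def getWorkingDirectory(): Path = new Path("customfs:///")
    override def mkdirs(f: Path, permission: FsPermission): Boolean = ???
    override def getFileStatus(f: Path): FileStatus = ???
  }

and the services file just lists that class:

  # src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
  com.example.fs.CustomFileSystem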
Once you imported that jar as a dependency in your spark-submit application, the custom scheme was loaded automatically, and you could start using it right away, e.g. ds.load("customfs://path").

But on Spark 2.4.0 this no longer seems to work. If you do exactly the same thing, you get an error like "No FileSystem for customfs". The only way I got it working on 2.4.0 was by explicitly setting the Spark property spark.hadoop.fs.customfs.impl (rough example at the bottom of this mail).

Do you consider this a bug, or is it an intentional change that should be documented somewhere?

Btw, digging a little into this, the cause seems to be that the FileSystem is now initialized before the actual dependencies are downloaded from the Maven repo (see here <https://github.com/apache/spark/blob/v2.4.0/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala#L66>). Since that initialization loads the available filesystems only once, at that point, the filesystems in the downloaded jars are not taken into account.

Thanks.
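P.S. For completeness, this is roughly how I submit the job now; the package coordinates and class name are placeholders for my actual artifact:

  spark-submit \
    --packages com.example:customfs:1.0.0 \
    --conf spark.hadoop.fs.customfs.impl=com.example.fs.CustomFileSystem \
    my-app.jar

and with that property set, reads like spark.read.load("customfs://path") work again inside the application.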