Github user srowen commented on the issue: https://github.com/apache/spark/pull/15382

Update on my thinking: the docs say the default should be "spark-warehouse" in the local working directory. The original code did establish a default on the local file system, though in the home dir, not the working dir. So I still favor fixing that, but that's a small, orthogonal question.

Although the default appears to be intended to be on the local FS, I don't think it must be. @avulanov mentions it might be on S3; for comparison, the Hive warehouse dir is on HDFS by default. I don't have evidence that this has been prohibited, so I guess we shouldn't assume this is a local path.

It seems simple to default to "spark-warehouse" in the local working dir of whatever file system is being used, be it local or HDFS. That means SQLConf shouldn't specify a default with any URI scheme or path. SQLConf should then resolve it, though at the moment there's no good way to plumb through a Hadoop Configuration. (I just made a new default one here for the moment.)

Then we're back to an interesting detail: Hadoop Path's `toUri` won't add a trailing slash in the case of a local file. Maybe it doesn't in general, since it can't always tell whether a path is a directory or not. Local file URIs, however, will have a trailing slash. It's probably best to leave the behavior and adjust the tests to not expect a trailing slash, as the original code seemed to do (?)

Does this still work on Windows? As far as I can tell it would.

Is it OK to instantiate `FileSystem.get(new Configuration())` in `SQLConf`? There's no easy access to the actual conf, but it is only trying to read the default file system, which shouldn't depend on Spark-specific Hadoop conf.

I think this change actually _restores_ the ability to specify a warehouse path with a URI scheme. Right now, it will be interpreted as a file path only if you write "file:/foo".
However, there's no way to write a relative file URI, so there's no way to guarantee that a default of "spark-warehouse" is a relative path on the local FS. After this change, HDFS users would see that this correctly defaults to "spark-warehouse" in an HDFS working directory.
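To illustrate the scheme distinction with a stdlib-only sketch (this is not Spark's actual resolution code; the class name is made up for the example): a scheme-less default like "spark-warehouse" carries no file-system assumption and is left to be resolved against whatever the default FS is, while "file:/foo" pins the path to the local FS. On the local FS, the relative default then resolves against the working directory, which is what the docs describe.

```java
import java.net.URI;
import java.nio.file.Paths;

public class WarehouseDefaultSketch {
    public static void main(String[] args) throws Exception {
        // A scheme-less default makes no assumption about the file system;
        // it can be resolved against local FS or HDFS alike.
        URI bare = new URI("spark-warehouse");
        System.out.println(bare.getScheme()); // null

        // An explicit scheme pins the path to the local file system.
        URI local = new URI("file:/foo");
        System.out.println(local.getScheme()); // file

        // On the local FS, a relative default resolves against the
        // working directory ("spark-warehouse" under the cwd).
        System.out.println(Paths.get("spark-warehouse").toAbsolutePath());
    }
}
```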
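On the trailing-slash detail, a stdlib-only sketch of the asymmetry (using `java.io.File` rather than Hadoop's `Path`, since the latter needs Hadoop on the classpath): `File.toURI()` appends a trailing slash only when it can see the path is an existing directory, whereas Hadoop's `Path.toUri` never adds one, which matches the not-yet-existing case below.

```java
import java.io.File;

public class TrailingSlashSketch {
    public static void main(String[] args) {
        // For an existing directory, java.io.File.toURI() appends a
        // trailing slash because it can tell the path is a directory.
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(tmp.toURI()); // ends with "/"

        // For a path that doesn't exist yet -- like a warehouse dir before
        // first use -- toURI() can't tell it's a directory, so no slash is
        // appended. Hadoop Path.toUri behaves like this case in general.
        File missing = new File(tmp, "spark-warehouse-nonexistent");
        System.out.println(missing.toURI()); // no trailing slash
    }
}
```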