Vitor-Avila commented on code in PR #29802: URL: https://github.com/apache/superset/pull/29802#discussion_r1701959131
########## superset/db_engine_specs/databricks.py: ########## @@ -434,7 +434,25 @@ def get_default_catalog( cls, database: Database, ) -> str | None: + """ + Return the default catalog. + + The default behavior for Databricks is confusing. When Unity Catalog is not + enabled we have (the DB engine spec hasn't been tested with it enabled): + + > SHOW CATALOGS; + spark_catalog + > SELECT current_catalog(); + hive_metastore + + To handle permissions correctly we use the result of `SHOW CATALOGS` when a + single catalog is returned. + """ with database.get_sqla_engine() as engine: + catalogs = {catalog for (catalog,) in engine.execute("SHOW CATALOGS")} + if len(catalogs) == 1: + return catalogs.pop() Review Comment: if `SHOW CATALOGS` returns only `spark_catalog`, it seems we're returning it. Do we want to return `spark_catalog`, or do we need to return `hive_metastore` instead (when there's only `spark_catalog`)? I'm asking it because I think there are issues with other metadata calls if we try to use `spark_catalog` when the catalog selected in the form is `hive_metastore` (or if UC is not enabled). Also, if `multi_catalog` is disabled, do we want to rely on `SHOW CATALOGS`? Should we validate if `multi_catalog` is disabled first, and if so return the value of `current_catalog()` directly (which should be the catalog from the form)? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For additional commands, e-mail: notifications-h...@superset.apache.org