Hi, folks!

I'd like to call for a discussion on SIP-95: https://github.com/apache/superset/issues/22862

The proposal calls for a "catalog" selector in SQL Lab, where in this context a "catalog" is "a collection of schemas". If I remember correctly this is called:

- A "catalog" in Presto/Trino;
- A "database" in Postgres;
- A "project" in BigQuery.

I'd like to increase the scope of SIP-95 by introducing catalogs not only in SQL Lab, but throughout the whole application. For example, when adding a dataset the user would be able to choose a database, then a catalog, then a schema, and finally a table.

Last year, while working on security issues related to schemas, I started adding the foundational support for catalogs in Superset. DB engine specs already have these attributes:

- supports_catalog
- supports_dynamic_catalog (can the catalog be changed on a per-query basis?)
- get_catalog_names()

Additionally, many of the methods in the DB engine specs already have `catalog` in their signatures, together with `schema`.
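
To make the attributes above concrete, here is a minimal standalone sketch of how a selector might consult them. It does not import Superset: the class names, the in-memory `_catalogs` mapping, the `catalog_choices` helper, and the argument-free `get_catalog_names()` are all assumptions for illustration, and the real method in the DB engine specs may take different arguments.

```python
from typing import Dict, List, Set


class BaseSpecSketch:
    """Toy stand-in for a DB engine spec (not Superset's real base class)."""

    # Attribute names come from the message; the defaults are assumptions.
    supports_catalog = False
    supports_dynamic_catalog = False

    @classmethod
    def get_catalog_names(cls) -> Set[str]:
        # Engines without catalog support report nothing.
        return set()


class MultiCatalogSpecSketch(BaseSpecSketch):
    """Toy spec for an engine that exposes several catalogs."""

    supports_catalog = True
    supports_dynamic_catalog = True

    # Hypothetical data: catalog name -> schemas it contains.
    _catalogs: Dict[str, List[str]] = {
        "sales": ["public", "staging"],
        "hr": ["public"],
    }

    @classmethod
    def get_catalog_names(cls) -> Set[str]:
        return set(cls._catalogs)


def catalog_choices(spec) -> List[str]:
    """Return the catalogs a selector could offer, sorted for display.

    The list is empty when the engine does not support catalogs, in which
    case the UI could skip straight from database to schema.
    """
    if not spec.supports_catalog:
        return []
    return sorted(spec.get_catalog_names())
```

With this sketch, `catalog_choices(MultiCatalogSpecSketch)` yields the two hypothetical catalogs, while `catalog_choices(BaseSpecSketch)` yields an empty list.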

Note that one of the biggest challenges in supporting catalogs is that the SQLAlchemy URI needs to be modified depending on the selected catalog. This will have to be done via `adjust_engine_params` in each DB engine spec that we want to support, but the base implementation is there already.
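
As a rough illustration of the kind of URI rewrite involved, the sketch below swaps the database component of a connection string for the selected catalog, which is roughly what a Postgres-style engine would need. It uses only the standard library and plain strings; `with_catalog` is a hypothetical helper, not the actual `adjust_engine_params` implementation, and other engines may carry the catalog elsewhere in the URI.

```python
from typing import Optional
from urllib.parse import quote, urlsplit, urlunsplit


def with_catalog(uri: str, catalog: Optional[str]) -> str:
    """Return `uri` with its path (the database name) replaced by `catalog`.

    Assumption: the catalog lives in the path component of the URI, as it
    does for Postgres-style connection strings.
    """
    if not catalog:
        # No catalog selected: keep whatever the URI already points at.
        return uri
    parts = urlsplit(uri)
    # Percent-encode the catalog so special characters cannot break the URI.
    new_path = "/" + quote(catalog, safe="")
    return urlunsplit(parts._replace(path=new_path))
```

For example, `with_catalog("postgresql://user:pw@host:5432/sales", "analytics")` would point the same connection at the `analytics` database, leaving credentials, host, port, and any query parameters untouched.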

The remaining work includes:

1. Refactoring the data permissions to include catalogs, similar to how it works today for databases and schemas.
2. Introducing UI inputs for selecting catalogs when creating a dataset or in SQL Lab.

For (2), the work needed overlaps with SIP-111 ("Proposal for improved database, schema, and table selection UI in SQL Lab sidebar", https://github.com/apache/superset/issues/26395), which hasn't been officially discussed yet.

Thanks,
--Beto
