dosubot[bot] commented on issue #29915: URL: https://github.com/apache/superset/issues/29915#issuecomment-2283422296
<!-- Greeting --> @jelu Hello, I'm here to help you with any bugs or questions you have. I can also assist you in becoming a contributor.

<!-- Answer -->
To resolve the unique-constraint error on table names when importing an example dataset in Superset, you can relax the uniqueness constraint to cover `cluster_name` together with `datasource_name`. This scopes the constraint to the database/source rather than applying it globally to the table name alone. Here is the relevant part of the migration script that addresses this (an excerpt — `conv`, `datasources`, and the `generic_find_*` helpers are defined at module level in the same migration file):

```python
def upgrade():
    bind = op.get_bind()
    insp = sa.engine.reflection.Inspector.from_engine(bind)

    # Add the new, less restrictive uniqueness constraint.
    with op.batch_alter_table("datasources", naming_convention=conv) as batch_op:
        batch_op.create_unique_constraint(
            "uq_datasources_cluster_name",
            ["cluster_name", "datasource_name"],
        )

    # Augment the tables which have a foreign key constraint related to the
    # datasources.datasource_name column.
    for foreign in ["columns", "metrics"]:
        with op.batch_alter_table(foreign, naming_convention=conv) as batch_op:
            # Add the datasource_id column with the relevant constraints.
            batch_op.add_column(sa.Column("datasource_id", sa.Integer))
            batch_op.create_foreign_key(
                f"fk_{foreign}_datasource_id_datasources",
                "datasources",
                ["datasource_id"],
                ["id"],
            )

        # Helper table for database migration using minimal schema.
        table = sa.Table(
            foreign,
            sa.MetaData(),
            sa.Column("id", sa.Integer, primary_key=True),
            sa.Column("datasource_name", sa.String(255)),
            sa.Column("datasource_id", sa.Integer),
        )

        # Migrate the existing data.
        for datasource in bind.execute(datasources.select()):
            bind.execute(
                table.update()
                .where(table.c.datasource_name == datasource.datasource_name)
                .values(datasource_id=datasource.id)
            )

        with op.batch_alter_table(foreign, naming_convention=conv) as batch_op:
            # Drop the datasource_name column and associated constraints.
            names = generic_find_fk_constraint_names(
                foreign, {"datasource_name"}, "datasources", insp
            )
            for name in names:
                batch_op.drop_constraint(
                    name or f"fk_{foreign}_datasource_name_datasources",
                    type_="foreignkey",
                )
            batch_op.drop_column("datasource_name")

    try:
        # Drop the old, more restrictive uniqueness constraint.
        with op.batch_alter_table("datasources", naming_convention=conv) as batch_op:
            batch_op.drop_constraint(
                generic_find_uq_constraint_name(
                    "datasources", {"datasource_name"}, insp
                )
                or "uq_datasources_datasource_name",
                type_="unique",
            )
    except Exception as ex:
        logging.warning(
            "Constraint drop failed, you may want to do this "
            "manually on your database. For context, this is a known "
            "issue around nondeterministic constraint names on Postgres "
            "and perhaps more databases through SQLAlchemy."
        )
        logging.exception(ex)
```

Additionally, when loading example datasets such as "birth_names", the `load_data` function in the `superset/examples/birth_names.py` script handles writing the data into the specified table name:

```python
def load_data(tbl_name: str, database: Database, sample: bool = False) -> None:
    url = get_example_url("birth_names2.json.gz")
    pdf = pd.read_json(url, compression="gzip")
    if database.backend == "presto":
        pdf.ds = pd.to_datetime(pdf.ds, unit="ms")
        pdf.ds = pdf.ds.dt.strftime("%Y-%m-%d %H:%M:%S")
    else:
        pdf.ds = pd.to_datetime(pdf.ds, unit="ms")
    pdf = pdf.head(100) if sample else pdf

    with database.get_sqla_engine() as engine:
        schema = inspect(engine).default_schema_name
        pdf.to_sql(
            tbl_name,
            engine,
            schema=schema,
            if_exists="replace",
            chunksize=500,
            dtype={
                "ds": DateTime if database.backend != "presto" else String(255),
                "gender": String(16),
                "state": String(10),
                "name": String(255),
            },
            method="multi",
            index=False,
        )
    print("Done loading table!")
    print("-" * 80)
```

This script ensures that the table name
`tbl_name` does not collide with existing tables in the specified database schema: `to_sql` is called with `if_exists="replace"`, so reloading an example overwrites any existing table of that name instead of failing <sup>[[1]](https://github.com/apache/superset/blob/master/superset/migrations/versions/2017-10-03_14-37_4736ec66ce19_.py)</sup><sup>[[2]](https://github.com/apache/superset/blob/master/superset/migrations/versions/2015-09-21_17-30_4e6a06bad7a8_init.py)</sup><sup>[[3]](https://github.com/apache/superset/blob/master/superset/examples/birth_names.py)</sup>.

<!-- Dosu Comment Footer -->
*To continue talking to [Dosu](https://dosu.dev), mention @dosu.*

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org