michael-s-molina commented on PR #33720: URL: https://github.com/apache/superset/pull/33720#issuecomment-3078599846
@Quatters @mistercrunch @rusackas I executed the following migration to convert our existing MySQL database to be adherent to the changes in this PR. Step 1: Backup Your Database First ``` mysqldump -u your_username -p your_database_name > backup_$(date +%Y%m%d_%H%M%S).sql ``` Step 2: Check Current State ``` -- Check current database character set SHOW CREATE DATABASE your_database_name; -- Check all tables and their character sets SELECT TABLE_NAME, TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'your_database_name'; -- Check all columns with character sets SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'your_database_name' AND CHARACTER_SET_NAME IS NOT NULL ORDER BY TABLE_NAME, COLUMN_NAME; ``` Step 3: Change Database Default ``` -- Change the database default character set ALTER DATABASE your_database_name DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci; ``` Step 4: Generate Conversion Commands for All Tables ``` -- Generate ALTER TABLE commands for all tables SELECT CONCAT( 'ALTER TABLE `', TABLE_NAME, '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;' ) AS conversion_command FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'your_database_name' AND TABLE_TYPE = 'BASE TABLE' ORDER BY TABLE_NAME; ``` Step 5: Optional - Check Table Sizes Before Conversion ``` SELECT TABLE_NAME, TABLE_ROWS, ROUND(((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024), 2) AS 'Size_MB', ROUND(AVG_ROW_LENGTH) AS 'Avg_Row_Length' FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'your_database_name' AND TABLE_TYPE = 'BASE TABLE' ORDER BY (DATA_LENGTH + INDEX_LENGTH); ``` Step 6: Prepare for Conversion ``` -- Disable foreign key checks temporarily (if you have foreign keys) SET FOREIGN_KEY_CHECKS = 0; -- Optional: Increase timeouts for large tables SET SESSION innodb_lock_wait_timeout = 300; SET SESSION lock_wait_timeout = 300; ``` Step 7: Convert All Tables Replace your_database_name with your actual database name, then run the output from Step 4. Here's the template: ``` -- Example conversion commands (replace with your actual table names) ALTER TABLE `table1` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci; ALTER TABLE `table2` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci; ALTER TABLE `table3` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci; -- ... (continue for all tables) ``` Step 8: Re-enable Constraints ``` -- Re-enable foreign key checks SET FOREIGN_KEY_CHECKS = 1; ``` Step 9: Verify the Conversion ``` -- Verify database default SHOW CREATE DATABASE your_database_name; -- Verify all tables are converted SELECT TABLE_NAME, TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'your_database_name' AND TABLE_COLLATION != 'utf8mb4_0900_ai_ci'; -- Should return no rows if all tables are converted -- Verify all character columns are converted SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'your_database_name' AND CHARACTER_SET_NAME IS NOT NULL AND CHARACTER_SET_NAME != 'utf8mb4'; -- Should return no rows if all columns are converted ``` Some observations: - You can't simply change the database default character set. If you do that, the setting will only be applicable to new tables. - This means that you need to convert all tables and their data to the new setting. - The process does not take very long time (100k rows in 17 seconds) but does require downtime. @Quatters this needs to be explicitly called in `UPDATING.md`. - We also need to mention the required change for the connection string in `UPDATING.md` as admins will not read the Configuring Superset docs. - After the migration, characters like 大盘 are rendered correctly but emojis like 😀🌟 are rendered as ???????? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For additional commands, e-mail: notifications-h...@superset.apache.org