michael-s-molina commented on PR #33720:
URL: https://github.com/apache/superset/pull/33720#issuecomment-3078599846

   @Quatters @mistercrunch @rusackas I executed the following migration to 
convert our existing MySQL database to be adherent to the changes in this PR.
   
   Step 1: Backup Your Database First
   
   ```
   mysqldump -u your_username -p your_database_name > backup_$(date 
+%Y%m%d_%H%M%S).sql
   ```
   
   Step 2: Check Current State
   
   ```
   -- Check current database character set
   SHOW CREATE DATABASE your_database_name;
   
   -- Check all tables and their character sets
   SELECT TABLE_NAME, TABLE_COLLATION 
   FROM INFORMATION_SCHEMA.TABLES 
   WHERE TABLE_SCHEMA = 'your_database_name';
   
   -- Check all columns with character sets
   SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME, DATA_TYPE
   FROM INFORMATION_SCHEMA.COLUMNS 
   WHERE TABLE_SCHEMA = 'your_database_name' 
   AND CHARACTER_SET_NAME IS NOT NULL
   ORDER BY TABLE_NAME, COLUMN_NAME;
   ```
   
   Step 3: Change Database Default
   
   ```
   -- Change the database default character set
   ALTER DATABASE your_database_name 
   DEFAULT CHARACTER SET utf8mb4 
   COLLATE utf8mb4_0900_ai_ci;
   ```
   
   Step 4: Generate Conversion Commands for All Tables
   
   ```
   -- Generate ALTER TABLE commands for all tables
   SELECT CONCAT(
       'ALTER TABLE `', TABLE_NAME, 
       '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;'
   ) AS conversion_command
   FROM INFORMATION_SCHEMA.TABLES 
   WHERE TABLE_SCHEMA = 'your_database_name' 
   AND TABLE_TYPE = 'BASE TABLE'
   ORDER BY TABLE_NAME;
   ```
   
   Step 5: Optional - Check Table Sizes Before Conversion
   
   ```
   SELECT 
       TABLE_NAME,
       TABLE_ROWS,
       ROUND(((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024), 2) AS 'Size_MB',
       ROUND(AVG_ROW_LENGTH) AS 'Avg_Row_Length'
   FROM INFORMATION_SCHEMA.TABLES 
   WHERE TABLE_SCHEMA = 'your_database_name' 
   AND TABLE_TYPE = 'BASE TABLE'
   ORDER BY (DATA_LENGTH + INDEX_LENGTH);
   ```
   
   Step 6: Prepare for Conversion
   
   ```
   -- Disable foreign key checks temporarily (if you have foreign keys)
   SET FOREIGN_KEY_CHECKS = 0;
   
   -- Optional: Increase timeouts for large tables
   SET SESSION innodb_lock_wait_timeout = 300;
   SET SESSION lock_wait_timeout = 300;
   ```
   
   Step 7: Convert All Tables
   Replace your_database_name with your actual database name, then run the 
output from Step 4. Here's the template:
   
   ```
   -- Example conversion commands (replace with your actual table names)
   ALTER TABLE `table1` CONVERT TO CHARACTER SET utf8mb4 COLLATE 
utf8mb4_0900_ai_ci;
   ALTER TABLE `table2` CONVERT TO CHARACTER SET utf8mb4 COLLATE 
utf8mb4_0900_ai_ci;
   ALTER TABLE `table3` CONVERT TO CHARACTER SET utf8mb4 COLLATE 
utf8mb4_0900_ai_ci;
   -- ... (continue for all tables)
   ```
   
   Step 8: Re-enable Constraints
   
   ```
   -- Re-enable foreign key checks
   SET FOREIGN_KEY_CHECKS = 1;
   ```
   
   Step 9: Verify the Conversion
   
   ```
   -- Verify database default
   SHOW CREATE DATABASE your_database_name;
   
   -- Verify all tables are converted
   SELECT TABLE_NAME, TABLE_COLLATION 
   FROM INFORMATION_SCHEMA.TABLES 
   WHERE TABLE_SCHEMA = 'your_database_name'
   AND TABLE_COLLATION != 'utf8mb4_0900_ai_ci';
   
   -- Should return no rows if all tables are converted
   
   -- Verify all character columns are converted
   SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
   FROM INFORMATION_SCHEMA.COLUMNS 
   WHERE TABLE_SCHEMA = 'your_database_name' 
   AND CHARACTER_SET_NAME IS NOT NULL
   AND CHARACTER_SET_NAME != 'utf8mb4';
   
   -- Should return no rows if all columns are converted
   ```
   
   Some observations:
   - You can't simply change the database default character set. If you do 
that, the setting will only be applicable to new tables.
   - This means that you need to convert all tables and their data to the new 
setting.
   - The process does not take very long time (100k rows in 17 seconds) but 
does require downtime. @Quatters this needs to be explicitly called in 
`UPDATING.md`.
   - We also need to mention the required change for the connection string in 
`UPDATING.md` as admins will not read the Configuring Superset docs.
   - After the migration, characters like 大盘 are rendered correctly but emojis 
like 😀🌟 are rendered as ???????? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org
For additional commands, e-mail: notifications-h...@superset.apache.org

Reply via email to