sadpandajoe opened a new pull request, #37557:
URL: https://github.com/apache/superset/pull/37557

   ### SUMMARY
   
   Fixes the example loading flow where datasets created by `load_parquet_table()` couldn't be matched by the import flow that looks up datasets by UUID.
   
   **Root cause:** The example loading has two separate code paths:
   1. **Data loading path** (`data_loading.py` → `generic_loader.py`): Creates `SqlaTable` without setting UUID
   2. **Config import path** (`load_examples_from_configs()`): Looks up existing datasets by UUID only
   
   When a dataset is created without a UUID, the import flow can't find it and either fails or creates a duplicate.
   
   **Solution:** Thread UUID from YAML configs through the data loading path:
   - Extract `uuid` field from YAML configs in `data_loading.py`
   - Pass UUID to `create_generic_loader()` and `load_parquet_table()`
   - Set UUID on new `SqlaTable` objects
   - Backfill UUID on existing datasets that have `uuid=None`
   - Use UUID-first lookup to avoid unique constraint violations
   - Include schema in lookups to prevent cross-schema collisions
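
The lookup and backfill steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Dataset` is a plain in-memory stand-in for `SqlaTable`, and `find_dataset` and `load_table` are hypothetical names that only mirror the described behavior (UUID-first match, then table name plus schema, then backfill of a missing UUID).

```python
from dataclasses import dataclass
from typing import List, Optional
from uuid import UUID


@dataclass
class Dataset:
    """Hypothetical stand-in for SqlaTable; not the real model."""
    table_name: str
    schema: Optional[str] = None
    uuid: Optional[UUID] = None


def find_dataset(
    datasets: List[Dataset],
    table_name: str,
    schema: Optional[str],
    uuid: Optional[UUID],
) -> Optional[Dataset]:
    """Look up by UUID first, then fall back to (table_name, schema)."""
    if uuid is not None:
        for ds in datasets:
            if ds.uuid == uuid:
                return ds
    # Including the schema keeps same-named tables in other schemas apart.
    for ds in datasets:
        if ds.table_name == table_name and ds.schema == schema:
            return ds
    return None


def load_table(
    datasets: List[Dataset],
    table_name: str,
    schema: Optional[str] = None,
    uuid: Optional[UUID] = None,
) -> Dataset:
    """Create the dataset with its UUID, or backfill the UUID if missing."""
    ds = find_dataset(datasets, table_name, schema, uuid)
    if ds is None:
        ds = Dataset(table_name=table_name, schema=schema, uuid=uuid)
        datasets.append(ds)
    elif ds.uuid is None and uuid is not None:
        ds.uuid = uuid  # backfill a dataset created before UUIDs were threaded through
    return ds
```

Trying the UUID before the name-based fallback means a dataset that already carries the configured UUID is reused even if its name differs, which is what avoids inserting a second row with the same UUID.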
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   
   N/A - Backend-only change
   
   ### TESTING INSTRUCTIONS
   
   1. Run the new unit tests:
      ```bash
      pytest tests/unit_tests/examples/ -v
      ```
      All 32 tests should pass.
   
   2. Manual verification:
      ```bash
      superset load-examples --force
      ```
      - Dashboards should load with all charts working
      - No "dataset not found" errors in logs
      - Re-running should find existing datasets by UUID (no duplicates)
   
   ### ADDITIONAL INFORMATION
   
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   
   **Commits:**
   - `124a1b0510` - Preserve UUIDs from YAML configs in load_parquet_table
   - `cbff869c2d` - Extract _find_dataset helper for UUID-first lookup
   - `2a9292e4ad` - Add schema to _find_dataset lookup to prevent cross-schema collisions
   - `908359a1b5` - Set and backfill schema on SqlaTable creation
   - `8e20e38b9a` - Add comprehensive tests (32 total)
   
   🤖 Generated with [Claude Code](https://claude.ai/code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

