aminghadersohi opened a new pull request, #38576: URL: https://github.com/apache/superset/pull/38576
### SUMMARY

Add deduplication to the `cache_dashboard_thumbnail` Celery task to prevent concurrent Selenium sessions for the same dashboard.

**Problem:** Multiple thumbnail tasks can be queued and run concurrently for the same dashboard because:

1. The dashboard API set cache status to PENDING before enqueuing, so concurrent requests all see PENDING and each queues a new task
2. The `update_thumbnail()` model method uses `force=True`, which bypasses all cache status checks
3. No task-level deduplication existed; the only dedup was inside `compute_and_cache()`, which runs *after* the expensive Selenium session setup

**Solution:**

- **Task-level dedup**: Before doing expensive work, the task checks if another instance is already computing the same thumbnail (via cache status). If the status is COMPUTING and not stale, it skips. This works even for `force=True` calls.
- **API-level race fix**: Set cache status to COMPUTING (not PENDING) before enqueuing the Celery task. Subsequent concurrent requests see COMPUTING and `should_trigger_task()` returns false, preventing duplicate enqueues.

### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A - Backend-only change.

### TESTING INSTRUCTIONS

1. Enable `THUMBNAILS` and `THUMBNAILS_SQLA_LISTENERS` feature flags
2. Visit dashboard list or edit a dashboard; this should trigger thumbnail generation
3. Rapid concurrent requests to `/api/v1/dashboard/<id>/thumbnail/<digest>/` should result in only one Selenium task, not multiple
4. Unit tests: `pytest tests/unit_tests/tasks/test_thumbnails.py -v`

### ADDITIONAL INFORMATION

- [ ] Has associated issue:
- [x] Required feature flags: `THUMBNAILS`
- [ ] Changes UI
- [ ] Includes DB Migration
- [ ] Introduces new feature or API
- [ ] Removes existing feature or API
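
For readers who want a concrete picture of the task-level dedup described under **Solution**, here is a minimal sketch. It is not the code in this PR: the in-memory `cache` dict, the `CacheEntry` and `Status` types, the `COMPUTING_TTL_SECONDS` staleness window, and the `run_thumbnail_task` function are all hypothetical stand-ins, and the check-then-set here is not atomic the way a real shared cache would need it to be.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    PENDING = "pending"
    COMPUTING = "computing"
    UPDATED = "updated"
    ERROR = "error"


# Hypothetical staleness window: a COMPUTING entry older than this is
# assumed to belong to a crashed worker and may be recomputed.
COMPUTING_TTL_SECONDS = 300.0


@dataclass
class CacheEntry:
    status: Status
    started_at: float = field(default_factory=time.monotonic)


# Stand-in for the shared thumbnail cache, keyed per dashboard (assumption).
cache: dict[str, CacheEntry] = {}


def is_computing(key: str, now: Optional[float] = None) -> bool:
    """True if another task holds a fresh (non-stale) COMPUTING status."""
    entry = cache.get(key)
    if entry is None or entry.status is not Status.COMPUTING:
        return False
    now = time.monotonic() if now is None else now
    return (now - entry.started_at) < COMPUTING_TTL_SECONDS


def run_thumbnail_task(key: str, render: Callable[[], object]) -> bool:
    """Return True if this call rendered, False if skipped as a duplicate.

    The status check happens before `render`, which stands in for the
    expensive Selenium session, so a duplicate never starts a browser.
    """
    if is_computing(key):
        return False
    cache[key] = CacheEntry(Status.COMPUTING)
    try:
        render()
    except Exception:
        cache[key] = CacheEntry(Status.ERROR)
        raise
    cache[key] = CacheEntry(Status.UPDATED)
    return True
```

Because the check does not depend on how the task was enqueued, a task queued with `force=True` is skipped in the same way as any other duplicate. The staleness window keeps a crashed worker from blocking regeneration indefinitely.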
