aminghadersohi opened a new pull request, #38576:
URL: https://github.com/apache/superset/pull/38576

   ### SUMMARY
   
   Add deduplication to the `cache_dashboard_thumbnail` Celery task to prevent 
concurrent Selenium sessions for the same dashboard.
   
   **Problem:** Multiple thumbnail tasks can be queued and run concurrently for 
the same dashboard because:
   1. The dashboard API set cache status to PENDING before enqueuing, so 
concurrent requests all see PENDING and each queues a new task
   2. The `update_thumbnail()` model method uses `force=True`, which bypasses 
all cache status checks
   3. No task-level deduplication existed — the only dedup was inside 
`compute_and_cache()`, which runs *after* the expensive Selenium session setup
   
   **Solution:**
   - **Task-level dedup**: Before doing expensive work, the task checks if 
another instance is already computing the same thumbnail (via cache status). If 
the status is COMPUTING and not stale, it skips. This works even for 
`force=True` calls.
   - **API-level race fix**: Set cache status to COMPUTING (not PENDING) before 
enqueuing the Celery task. Subsequent concurrent requests see COMPUTING and 
`should_trigger_task()` returns false, preventing duplicate enqueues.
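
The task-level check can be pictured with a minimal sketch. It is not the code in this PR: `Status`, `CacheEntry`, the in-memory `cache`, `is_being_computed()`, `thumbnail_task()`, and the 300-second staleness window are all hypothetical stand-ins, and `render` stands in for the Selenium session. The sketch covers only the task side; how the real task tells its own API-written COMPUTING marker apart from another worker's is not described in this summary and is not modeled here.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class Status(Enum):
    PENDING = "pending"
    COMPUTING = "computing"
    UPDATED = "updated"


@dataclass
class CacheEntry:
    status: Status
    updated_at: float  # seconds since the epoch


# Hypothetical in-memory stand-in for the thumbnail cache.
cache: Dict[str, CacheEntry] = {}

# Assumed staleness window; the real threshold is not stated in the PR text.
STALE_AFTER_SECONDS = 300.0


def is_being_computed(key: str, now: float) -> bool:
    """Return True if a non-stale COMPUTING entry exists for this key."""
    entry = cache.get(key)
    if entry is None or entry.status is not Status.COMPUTING:
        return False
    return now - entry.updated_at < STALE_AFTER_SECONDS


def thumbnail_task(
    key: str,
    render: Callable[[str], None],
    force: bool = False,
    now: Optional[float] = None,
) -> bool:
    """Return True if this call rendered, False if it skipped."""
    now = time.time() if now is None else now
    # The dedup check runs first and deliberately ignores `force`.
    if is_being_computed(key, now):
        return False
    entry = cache.get(key)
    if not force and entry is not None and entry.status is Status.UPDATED:
        return False  # a thumbnail is already cached and no refresh was forced
    cache[key] = CacheEntry(Status.COMPUTING, now)
    render(key)  # stands in for the expensive Selenium session
    cache[key] = CacheEntry(Status.UPDATED, now)
    return True
```

In this sketch a `force=True` call still skips while a fresh COMPUTING entry exists, and a COMPUTING entry older than the window no longer blocks new work, so a crashed worker cannot wedge the key forever.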
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   
   N/A - Backend-only change.
   
   ### TESTING INSTRUCTIONS
   
   1. Enable `THUMBNAILS` and `THUMBNAILS_SQLA_LISTENERS` feature flags
   2. Visit the dashboard list or edit a dashboard; either should trigger 
thumbnail generation
   3. Rapid concurrent requests to `/api/v1/dashboard/<id>/thumbnail/<digest>/` 
should result in only one Selenium task, not multiple
   4. Unit tests: `pytest tests/unit_tests/tasks/test_thumbnails.py -v`
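
The expectation in step 3 can be illustrated with a toy model of requests that arrive before any task has finished. Everything here is an assumption made for illustration: `should_trigger()` is a simplified rule, not Superset's `should_trigger_task()`, and the requests are processed one after another rather than truly in parallel, so the small window between reading and writing the status is not modeled.

```python
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    PENDING = "pending"
    COMPUTING = "computing"


def should_trigger(status: Optional[Status]) -> bool:
    # Toy rule (an assumption): only an in-progress computation blocks a new task.
    return status is not Status.COMPUTING


def simulate_requests(marker: Status, n_requests: int) -> int:
    """Count tasks enqueued when n requests arrive before any task finishes."""
    cache: Dict[str, Status] = {}
    queue: List[str] = []
    for _ in range(n_requests):
        if should_trigger(cache.get("dashboard-1")):
            cache["dashboard-1"] = marker  # status is written before enqueuing
            queue.append("dashboard-1")
    return len(queue)


if __name__ == "__main__":
    print(simulate_requests(Status.PENDING, 5))    # prints 5: every request enqueues
    print(simulate_requests(Status.COMPUTING, 5))  # prints 1: only the first enqueues
```

With PENDING as the marker every request still passes the check, which matches the duplicate-task problem described in the summary; with COMPUTING only the first request enqueues.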
   
   ### ADDITIONAL INFORMATION
   
   - [ ] Has associated issue:
   - [x] Required feature flags: `THUMBNAILS`
   - [ ] Changes UI
   - [ ] Includes DB Migration
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

