korbit-ai[bot] commented on code in PR #35595:
URL: https://github.com/apache/superset/pull/35595#discussion_r2418672894
##########
superset/tasks/scheduler.py:
##########
@@ -40,8 +41,18 @@
logger = logging.getLogger(__name__)
+@task_failure.connect
+def log_task_failure(sender=None, task_id=None, exception=None, args=None, kwargs=None, traceback=None, einfo=None, **kw):
+ logger.exception(f"Celery task {sender.name} failed: {exception}", exc_info=einfo)
-@celery_app.task(name="reports.scheduler")
+
+@celery_app.task(
+ name="reports.scheduler",
+ bind=True,
+ autoretry_for=(Exception,),
+ retry_kwargs={"max_retries": 3, "countdown": 60}, # Retry up to 3 times, wait 60s between
+ retry_backoff=True, # exponential backoff
Review Comment:
### Overly broad exception retry policy wastes resources
<details>
<summary>Tell me more</summary>
###### What is the issue?
With `autoretry_for=(Exception,)`, the task retries on every exception,
including non-transient ones, so permanent failures trigger retries that
cannot succeed.
###### Why this matters
This configuration will retry programming errors, configuration issues, and
other permanent failures that cannot be resolved by retrying, wasting
computational resources and delaying error detection.
###### Suggested change ∙ *Feature Preview*
Specify only transient exceptions that benefit from retrying:
```python
autoretry_for=(ConnectionError, TimeoutError, SoftTimeLimitExceeded),
```
Or exclude permanent error types:
```python
autoretry_for=(Exception,),
retry_kwargs={"max_retries": 3, "countdown": 60},
retry_backoff=True,
dont_autoretry_for=(CommandException, ValueError, TypeError),
```
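The difference between the two options can be illustrated without Celery. The following is a minimal stdlib-only sketch, not Celery's implementation: `should_retry`, `TRANSIENT`, and `PERMANENT` are hypothetical names, and the exception tuples are placeholders for whatever the task can actually raise. It assumes exclusions are checked before inclusions, which is how `dont_autoretry_for` is expected to interact with `autoretry_for`.

```python
def should_retry(exc, retry_for, dont_retry_for=()):
    """Return True if `exc` should trigger a retry under an allow/deny policy."""
    # Exclusions are checked first, so an excluded type is never retried
    # even when it also matches a broad entry such as Exception.
    if isinstance(exc, tuple(dont_retry_for)):
        return False
    return isinstance(exc, tuple(retry_for))


# Option 1: allow-list of transient errors (placeholder types).
TRANSIENT = (ConnectionError, TimeoutError)

# Option 2: retry everything except known-permanent errors (placeholder types).
PERMANENT = (ValueError, TypeError)

# An allow-list ignores anything not named in it.
allow_list_retries_value_error = should_retry(ValueError("bad config"), TRANSIENT)

# A deny-list still retries unexpected exception types such as KeyError.
deny_list_retries_key_error = should_retry(KeyError("missing"), (Exception,), PERMANENT)
```

The allow-list is the stricter choice: an exception type nobody anticipated fails fast instead of being retried, whereas the deny-list keeps retrying anything not explicitly excluded.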
</details>
##########
superset/tasks/scheduler.py:
##########
@@ -40,8 +41,18 @@
logger = logging.getLogger(__name__)
+@task_failure.connect
+def log_task_failure(sender=None, task_id=None, exception=None, args=None, kwargs=None, traceback=None, einfo=None, **kw):
+ logger.exception(f"Celery task {sender.name} failed: {exception}", exc_info=einfo)
Review Comment:
### Global task failure signal handler creates excessive logging overhead
<details>
<summary>Tell me more</summary>
###### What is the issue?
The signal handler will be triggered for ALL Celery task failures across the
entire application, not just tasks in this module, creating unnecessary logging
overhead.
###### Why this matters
This global signal handler will process every task failure in the Celery
worker, including unrelated tasks from other modules, leading to excessive
logging and potential performance degradation in high-throughput environments.
###### Suggested change ∙ *Feature Preview*
Move the signal handler to a more appropriate location like application
initialization, or make it more selective by checking the task name/module
before logging:
```python
@task_failure.connect
def log_task_failure(sender=None, task_id=None, exception=None, args=None,
                     kwargs=None, traceback=None, einfo=None, **kw):
    if sender and sender.name.startswith("reports."):
        logger.exception(f"Celery task {sender.name} failed: {exception}",
                         exc_info=einfo)
```
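The filtering logic can be exercised in isolation. Below is a minimal sketch that runs without Celery: the handler is a plain function (the `@task_failure.connect` registration is omitted), the sender is a `SimpleNamespace` stand-in for a task object, and the logger name `scheduler_sketch` and the boolean return value are assumptions added only to make the behavior observable. It also passes the exception instance as `exc_info`, which the standard `logging` module accepts, and guards against a missing `sender`.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("scheduler_sketch")  # hypothetical logger name


def log_task_failure(sender=None, exception=None, **kw):
    """Log failures only for tasks whose name starts with "reports."."""
    # sender may be None or lack a name, so read the attribute defensively.
    name = getattr(sender, "name", None) or ""
    if not name.startswith("reports."):
        return False  # unrelated task: skip logging
    # Passing the exception instance lets logging attach its traceback.
    logger.error("Celery task %s failed: %s", name, exception, exc_info=exception)
    return True


# Stand-ins for Celery task objects; only the `name` attribute is used.
report_task = SimpleNamespace(name="reports.scheduler")
other_task = SimpleNamespace(name="other.task")

logged_report = log_task_failure(sender=report_task, exception=ValueError("boom"))
logged_other = log_task_failure(sender=other_task, exception=ValueError("boom"))
```

Whether a name-prefix check or a move to application initialization is preferable depends on whether other task families should be logged by the same handler.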
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]