Lahiru Jayathilake created AIRAVATA-3893:
--------------------------------------------
Summary: Support for Automatic Resubmission of Failed Jobs After
Successful Submission
Key: AIRAVATA-3893
URL: https://issues.apache.org/jira/browse/AIRAVATA-3893
Project: Airavata
Issue Type: Improvement
Components: Airavata System
Reporter: Lahiru Jayathilake
Currently, the Airavata Metascheduler cannot automatically resubmit a job to
another cluster when the job is submitted successfully but then fails during
execution (e.g., due to resource allocation issues).
This feature request asks the Metascheduler to handle such failures by
automatically resubmitting the failed job to other configured clusters, so
that experiments complete more reliably.
This enhancement will improve the system’s robustness in handling transient
failures or resource constraints across multiple clusters.
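The requested behavior could be sketched roughly as follows. This is only an
illustration of the fallback idea, not Airavata code: JobResult,
run_with_resubmission, fake_run_job, and the cluster names are all hypothetical,
and the sketch assumes the clusters are tried in their configured order, each at
most once.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class JobResult:
    """Outcome of one execution attempt (hypothetical type, not an Airavata class)."""
    cluster: str
    succeeded: bool
    reason: str = ""


def run_with_resubmission(
    clusters: List[str],
    run_job: Callable[[str], JobResult],
    max_attempts: Optional[int] = None,
) -> Tuple[Optional[JobResult], List[JobResult]]:
    """Try the configured clusters in order until one attempt succeeds.

    Returns the successful result (or None if every attempt failed)
    together with the history of all attempts made.
    """
    limit = len(clusters) if max_attempts is None else min(max_attempts, len(clusters))
    history: List[JobResult] = []
    for cluster in clusters[:limit]:
        result = run_job(cluster)  # submit and wait for a terminal state
        history.append(result)
        if result.succeeded:
            return result, history
        # The job was accepted but failed during execution: move to the next cluster.
    return None, history


def fake_run_job(cluster: str) -> JobResult:
    """Stand-in for real submission: only 'cluster-b' completes the job."""
    if cluster == "cluster-b":
        return JobResult(cluster, True)
    return JobResult(cluster, False, "resource allocation failure")


if __name__ == "__main__":
    final, attempts = run_with_resubmission(
        ["cluster-a", "cluster-b", "cluster-c"], fake_run_job
    )
    print(final, len(attempts))
```

Keeping the attempt history alongside the final result is one possible design
choice; it would let the failure reason from each cluster be reported back on
the experiment.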