Daniel van der Ende created AIRFLOW-1658:
--------------------------------------------

             Summary: Kill (possibly) still running Druid indexing job after max timeout is exceeded
                 Key: AIRFLOW-1658
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1658
             Project: Apache Airflow
          Issue Type: Improvement
          Components: hooks
            Reporter: Daniel van der Ende
            Assignee: Daniel van der Ende
            Priority: Minor


Right now, the Druid hook has a max_ingestion_time parameter. If the total 
execution time of the Druid indexing job exceeds this timeout, an 
AirflowException is raised. However, this does not necessarily mean that the 
Druid task has failed (a busy Hadoop cluster, for example, could also be to 
blame for the slow performance). If the Airflow task is then retried, you end 
up with multiple Druid tasks performing the same work.
To prevent this, we can call the overlord's shutdown endpoint for the task id 
that is still running before raising the exception, as sketched below.
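
A minimal sketch of the idea, assuming the standard Druid overlord indexer API; the helper name, overlord URL, and the requests-based call are illustrative and not the actual DruidHook code:

{code:python}
import requests

def shutdown_druid_task(overlord_url, task_id):
    """Ask the Druid overlord to shut down a still-running indexing task."""
    # The overlord exposes a per-task shutdown endpoint in its indexer API.
    url = "{}/druid/indexer/v1/task/{}/shutdown".format(overlord_url, task_id)
    response = requests.post(url)
    response.raise_for_status()

# Illustrative use inside the hook's polling loop: once max_ingestion_time is
# exceeded, shut the task down before raising the AirflowException, so that a
# retry does not leave two Druid tasks doing the same work.
# shutdown_druid_task("http://overlord:8090", task_id)
{code}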


