[ 
https://issues.apache.org/jira/browse/AIRFLOW-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Semet updated AIRFLOW-1847:
---------------------------
    Description: 
Webhook sensor. May require a hook in the experimental API:
register an API endpoint and wait for input on each one.

It differs from the {{dag_runs}} API in that the format is not Airflow
specific: it is just a callback URL called by an external system on some
event, with its own application-specific content. That content is the
important part and needs to be forwarded to the DAG (as an XCom?).
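
For contrast, a minimal sketch of what an external caller has to do today with
the experimental {{dag_runs}} endpoint: it must know Airflow's URL scheme and
wrap its own data in a {{conf}} envelope (the host name, DAG id and payload
below are made up; the endpoint path is the one from the experimental REST API,
exact availability depends on the Airflow version):

{code:python}
# Triggering a DAG through the existing experimental REST API: the caller has
# to speak Airflow's own JSON format and wrap everything in a "conf" object.
# Host, dag_id and payload are placeholders for illustration.
import requests

payload = {"conf": {"repository": "my/project", "ref": "refs/heads/master"}}
resp = requests.post(
    "http://myairflow.server/api/experimental/dags/my_dag/dag_runs",
    json=payload,
)
resp.raise_for_status()
{code}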

Use Case:
- A DAG registers a WebHook sensor named {{<webhookname>}}
- A custom endpoint is exposed at
{{http://myairflow.server/api/experimental/webhook/<webhookname>}}.
- I set this URL in the external system I wish to use the webhook from. Ex: a
GitHub/GitLab project webhook
- When the external application performs a request to this URL, the request is
automatically forwarded to the WebHook sensor. For simplicity, we can have a
JsonWebHookSensor that would be able to carry any kind of JSON content.
- The sensor's only job would normally be to trigger the execution of a DAG,
providing it with the JSON content as an XCom (see the sketch after this list).
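
To make the use case concrete, here is a purely hypothetical sketch of the DAG
side. {{JsonWebHookSensor}}, its module path and its {{webhook_name}} parameter
are the things this issue proposes, not existing Airflow code; the downstream
task just pulls the posted payload back out of XCom:

{code:python}
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
# Proposed class from this issue -- it does not exist in Airflow yet, and the
# module path below is purely hypothetical.
from airflow.sensors.webhook_sensor import JsonWebHookSensor

dag = DAG("handle_gitlab_push",
          start_date=datetime(2017, 1, 1),
          schedule_interval=None)

# Would expose http://myairflow.server/api/experimental/webhook/gitlab_push
wait_for_push = JsonWebHookSensor(
    task_id="wait_for_push",
    webhook_name="gitlab_push",   # hypothetical parameter
    dag=dag,
)

def process_payload(**context):
    # The sensor would push the posted JSON as an XCom for downstream tasks.
    payload = context["ti"].xcom_pull(task_ids="wait_for_push")
    print("received ref: %s" % payload.get("ref"))

handle_push = PythonOperator(
    task_id="handle_push",
    python_callable=process_payload,
    provide_context=True,         # Airflow 1.x style
    dag=dag,
)

wait_for_push >> handle_push
{code}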

If several requests arrive at the same time, the system should be scalable
enough not to fall over or slow down the web UI. It is also possible to
instantiate an independent Flask/gunicorn server to split the load. It would
mean it runs on another port, but this could be just an option in the
configuration file, or even a completely independent application ({{airflow
webhookserver}}). I saw that recent changes integrated gunicorn into Airflow
core; I guess that can help this use case.
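
A rough sketch of what such an independent receiver could look like
(everything here is illustrative: the route, the webhook-name-to-DAG mapping,
and the hand-off, which relays the payload to the existing experimental
{{dag_runs}} endpoint as a stand-in for the proposed sensor plumbing):

{code:python}
# Illustrative standalone receiver that could run under gunicorn on its own
# port. It accepts any JSON POST on /api/experimental/webhook/<name> and
# relays the payload to Airflow's experimental dag_runs endpoint.
# The name->DAG mapping and the Airflow URL are invented for this sketch.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
AIRFLOW_API = "http://localhost:8080/api/experimental"
WEBHOOK_TO_DAG = {"gitlab_push": "handle_gitlab_push"}  # hypothetical mapping

@app.route("/api/experimental/webhook/<webhook_name>", methods=["POST"])
def receive(webhook_name):
    dag_id = WEBHOOK_TO_DAG.get(webhook_name)
    if dag_id is None:
        return jsonify(error="unknown webhook"), 404
    payload = request.get_json(force=True, silent=True) or {}
    # Hand the application-specific content to Airflow as the run's conf.
    resp = requests.post("%s/dags/%s/dag_runs" % (AIRFLOW_API, dag_id),
                         json={"conf": payload})
    return jsonify(status=resp.status_code), 202

if __name__ == "__main__":
    app.run(port=8081)
{code}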

!airflow-webhook-proposal.png|thumbnail!

> Webhook Sensor
> --------------
>
>                 Key: AIRFLOW-1847
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1847
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: core, operators
>            Reporter: Semet
>            Assignee: Semet
>            Priority: Minor
>              Labels: api, sensors, webhook
>         Attachments: airflow-webhook-proposal.png
>
>


