I don't quite get what this example is supposed to prove. I agree XML is too
much for everything, but CWL is not XML. And our point is not to bring in a new
pipeline manager; we don't have any other than Airflow. We do not do conversion
because we have to modify CWL files, send them to other
universities/hospitals/companies to reproduce results, and get improved CWL
back to run on our lovely Airflow. If we converted CWL to Python and then
modified the Python, we would later have to convert it back to CWL. Our CWLDAG
is just an Airflow DAG and can be used as usual within any Airflow pipeline:
task.set_upstream or task.set_downstream is enough to integrate a CWL pipeline
into a bigger Airflow pipeline.
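
To make the wiring concrete, here is a rough sketch of what I mean. It does
not import Airflow or cwl-airflow: Task is a stand-in that only mimics the
shape of set_upstream/set_downstream, and all task ids and the
execution_order helper are hypothetical, made up for illustration.

```python
# Stand-in for an Airflow task. This is NOT airflow.models.BaseOperator;
# it only mimics the set_upstream / set_downstream wiring.
class Task:
    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()
        self.downstream = set()

    def set_downstream(self, other):
        # self must finish before other starts
        self.downstream.add(other.task_id)
        other.upstream.add(self.task_id)

    def set_upstream(self, other):
        # other must finish before self starts
        other.set_downstream(self)


def execution_order(tasks):
    """Return task ids in dependency order (hypothetical helper)."""
    remaining = {t.task_id: set(t.upstream) for t in tasks}
    order = []
    while remaining:
        # Tasks whose upstream dependencies have all been scheduled.
        ready = sorted(tid for tid, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cycle detected")
        for tid in ready:
            order.append(tid)
            del remaining[tid]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


# Ordinary tasks of the bigger pipeline (hypothetical ids).
fetch = Task("fetch_inputs")
publish = Task("publish_results")

# Tasks that would come from the CWL workflow (hypothetical ids).
cwl_first = Task("cwl_first_step")
cwl_last = Task("cwl_last_step")
cwl_first.set_downstream(cwl_last)

# Hook the CWL part into the surrounding pipeline.
cwl_first.set_upstream(fetch)
cwl_last.set_downstream(publish)

order = execution_order([publish, cwl_last, cwl_first, fetch])
print(order)
```

The CWL steps stay untouched; only their first and last tasks get an extra
edge to the rest of the pipeline.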
  
> 
> Speaking of oozie-2-airflow - as one of the creators of it. Here is the
> repo: https://github.com/GoogleCloudPlatform/oozie-to-airflow
> 
> And some basic high-level concepts for O2A:
> 
>    - The world becomes a better place - one XML less at a time. This was
>    our motto. From what we know, Oozie XML workflows are pretty much
>    universally hated (well, maybe just disliked) in the data processing world.
>    But there are people who have 1000s of Oozie workflows, so we wanted to
>    give people a tool to easily convert many XMLs to Airflow DAGs.
>    - It's a one-way conversion. The aim is to take Oozie XML workflows as
>    source, convert them to DAGs and continue working on the generated DAGs.
>    - We accept that we are not perfect. We know we can convert many Oozie
>    workflows and they will work, but we have a number of limitations
>    (https://github.com/GoogleCloudPlatform/oozie-to-airflow#common-known-limitations).
>    We also have a list of issues:
>    https://github.com/GoogleCloudPlatform/oozie-to-airflow/issues that are
>    still open and need solving for some more sophisticated features of Oozie.
>    But this is OK. We accept that we are not perfect - the DAGs generated by
>    O2A can be further manually modified and evolved as Python DAGs to add
>    missing features.
>    - The Python code generated by O2A is generated with a focus on
>    readability. Those Python files are well written, formatted and structured
>    so that they read as if a human programmer wrote them, which makes it
>    easy to take them forward and modify them manually. For example, we can
>    then add all the Airflow-specific features Maxime mentioned (queues, XCom,
>    callbacks, etc.). So we do not have to be perfect.
