The big question is: why can't it just live in its own GitHub repository
and its own PyPI package? Why does it have to be packaged with our PyPI
package or live in the Airflow repo?

Max

On Wed, Nov 13, 2019 at 12:42 PM Andrey Kartashov <por...@porter.st> wrote:

> I don't quite get what this example is supposed to prove. I agree XML is
> too much for everything, but CWL is not XML. And our point is not to bring
> in a new pipeline manager; we don't have any other but Airflow. We do not
> do conversion because we have to change CWL files and send them to other
> universities/hospitals/companies to reproduce results, and then get
> improved CWL back to run on our lovely Airflow. If we converted CWL to
> Python and changed it in Python, we would later have to convert it back to
> CWL. Our CWLDAG is just an Airflow DAG and can be used as usual within any
> Airflow pipeline, with task.set_upstream or task.set_downstream
> integrating a CWL pipeline into a bigger Airflow pipeline.
>
> >
> > Speaking of oozie-2-airflow, as one of its creators, here is the
> > repo: https://github.com/GoogleCloudPlatform/oozie-to-airflow
> >
> > And some basic high-level concepts for O2A:
> >
> >    - The world becomes a better place - one XML less at a time. This was
> >    our motto. From what we know, Oozie XML workflows are pretty much
> >    universally hated (well, maybe just disliked) in the data processing
> >    world. But there are people who have 1000s of Oozie workflows, so we
> >    wanted to give people a tool to easily convert many XMLs to Airflow
> >    DAGs.
> >    - It's a one-way conversion. The aim is to take Oozie XML workflows as
> >    the source, convert them to DAGs, and continue working on the
> >    generated DAGs.
> >    - We accept that we are not perfect. We know we can convert many Oozie
> >    workflows and they will work, but we have a number of limitations
> >    (https://github.com/GoogleCloudPlatform/oozie-to-airflow#common-known-limitations).
> >    We also have a list of issues
> >    (https://github.com/GoogleCloudPlatform/oozie-to-airflow/issues) that
> >    are still open and need solving for some of the more sophisticated
> >    features of Oozie. But this is OK. We accept that we are not perfect:
> >    the DAGs generated by O2A can be further modified manually and evolved
> >    as Python DAGs to add missing features.
> >    - The Python code generated by O2A is generated with a focus on
> >    readability. Those Python files are well written, formatted, and
> >    structured so that they read as if a human programmer wrote them,
> >    which makes it easy to take them forward and modify them manually.
> >    For example, we can then add all the Airflow-specific features Maxime
> >    mentioned (queues, xcom, callbacks, etc.). So we do not have to be
> >    perfect.
>
>
